Difference between revisions of the page "SDPT Lab 9"
Current revision as of 18 May 2026, 15:38
Week 9 Lab Activity: Dynamic Analysis with Sanitizers and Valgrind
Objective
Extend your Oven Controller's CI pipeline with a fourth quality gate: a sanitize stage that builds the project under AddressSanitizer and UndefinedBehaviorSanitizer and runs your full test suite against the instrumented binary. By the end of this lab, your pipeline catches a fourth class of bug - runtime memory errors and undefined behavior - automatically on every push.
Background
You will add a new CMake build type called Sanitize that compiles with -fsanitize=address,undefined alongside -O1 -g -fno-omit-frame-pointer. The -O1 keeps the optimizer light enough that sanitizer reports stay readable; -g embeds debug information so backtraces have line numbers; -fno-omit-frame-pointer ensures those backtraces are actually complete on x86_64.
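The sanitizers report errors by printing to stderr. ASan then terminates the process with a nonzero exit status by default. UBSan, by default, prints its report and lets the program continue, so add -fno-sanitize-recover=all to the Sanitize flags (or set UBSAN_OPTIONS=halt_on_error=1) to make UBSan findings fatal as well. With that in place, your existing GoogleTest tests do not need any modification: if a test triggers a sanitizer error, the test binary exits with a failure status, ctest reports failure, the CI job fails, and the MR is blocked. The pipeline plumbing from Week 6 does this work for you for free.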
In addition to the sanitizer build, you will run Valgrind manually once against the same codebase. The two tools are complementary: ASan is fast and runs in CI; Valgrind is slow but sometimes catches things ASan misses (notably uninitialized reads, which require MemorySanitizer to catch with sanitizers - and you saw in the lecture why MSan is impractical).
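To make the uninitialized-read case concrete, here is a minimal sketch. The class and member names are hypothetical and not taken from the Oven Controller code. ASan stays silent about the first class, while Valgrind's Memcheck typically flags the branch that depends on the uninitialized member. The second class shows one possible fix.

```cpp
#include <cassert>

// Hypothetical stand-in for a piece of controller state (names are assumptions).
class BuggyOven {
public:
    BuggyOven() {}  // bug: target_c_ is never initialized
    bool at_target(int current_c) const {
        return current_c >= target_c_;  // reads an indeterminate value
    }
private:
    int target_c_;
};

// Same interface, with the member given a defined value on every path.
class FixedOven {
public:
    FixedOven() = default;
    explicit FixedOven(int target_c) : target_c_(target_c) {}
    bool at_target(int current_c) const { return current_c >= target_c_; }
    int target_c() const { return target_c_; }
private:
    int target_c_ = 0;  // default member initializer covers all constructors
};
```

A default member initializer is only one option; initializing the member in every constructor's initializer list works equally well.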
You will also reuse gcovr from Week 4. Dynamic analysis can only find bugs on code paths your tests actually execute, so coverage is part of the deliverable.
The point of this lab is not to introduce new tools you have never seen - you have met all four (ASan/UBSan/Valgrind/gcovr) in lectures. The point is to wire them together into your existing pipeline and prove the wiring works by deliberately breaking things.
Documentation pointers: man valgrind, the AddressSanitizer wiki at https://github.com/google/sanitizers/wiki/AddressSanitizer, the UBSan documentation at https://clang.llvm.org/docs/UndefinedBehaviorSanitizer.html, and your existing gcovr docs.
Tasks
Work on a feature branch opened from a tracked Issue. Open a Merge Request when you are ready to integrate.
1. Extend the Docker build environment. Update your Dockerfile to install valgrind alongside the tools already present. Rebuild and verify with valgrind --version. The sanitizers themselves ship with the compiler, so no separate package is needed - but confirm by running echo "int main(){}" | g++ -fsanitize=address -x c++ - -o /dev/null inside the container; if that succeeds, ASan is available.
2. Add the Sanitize build type to CMake. In your CMakeLists.txt, add logic that, when CMAKE_BUILD_TYPE equals Sanitize, sets compile and link options for -fsanitize=address,undefined plus -O1 -g -fno-omit-frame-pointer. Make sure both compile and link options include the sanitizer flag - if you forget the link side, you will get cryptic linker errors about missing __asan_* symbols. Test locally: cmake -B build-san -DCMAKE_BUILD_TYPE=Sanitize, then build, then run a test binary directly to confirm it produces output that mentions ASan.
3. Run your existing tests under the sanitizers and clean up. Run ctest --test-dir build-san --output-on-failure. If any tests fail due to sanitizer reports, fix the underlying bugs in the Oven Controller code (do not suppress them). Typical findings in EE-trained codebases: forgotten delete[], signed overflow in temperature math, reading from a member variable that was never initialized in a constructor. Each fix should be its own small commit with a message describing what the sanitizer reported.
4. Add a sanitize stage to GitLab CI. Edit .gitlab-ci.yml to add a new sanitize stage after test. The job should: configure a Sanitize build, compile it, and run ctest against it. Use your existing Docker image. The job must fail the pipeline on any sanitizer error.
5. Trigger ASan on purpose. On your feature branch, introduce a deliberate heap-buffer-overflow bug somewhere in the production code. Push, watch the sanitize stage fail, capture the ASan report (the job-log permalink in the MR description is sufficient), then revert. Your goal is to see the full ASan diagnostic - the line with the offending access, the line of the allocation, and the "stack-buffer-overflow" or "heap-buffer-overflow" classification.
6. Trigger UBSan on purpose. Repeat with a deliberate signed integer overflow or null pointer dereference. The UBSan report looks different from ASan's - a one-line "runtime error" message at the trigger point. Capture, revert.
7. Trigger LeakSanitizer on purpose. Repeat with a deliberate memory leak (allocate something on the heap, do not free it). LeakSan reports at program exit, not at the leak site. Capture, revert.
8. Run Valgrind once, manually. Build your project in a normal Debug configuration (no sanitizers - they conflict with Valgrind's instrumentation). From inside your Docker container, run valgrind --leak-check=full --show-leak-kinds=all --track-origins=yes ./your_test_binary. Save the full output. If Valgrind reports any errors that ASan did not, fix them. Add a paragraph to your MR description noting which tool caught what, or that the two agreed completely.
9. Re-check coverage. Recall from Week 4: gcovr -r .. --html-details against a coverage-instrumented build. Generate a current coverage report. Compare against your Week 4 baseline. If coverage dropped, write enough additional tests to bring it back. Coverage gaps are places where sanitizer findings are silently impossible, so this is part of the dynamic-analysis story even if the tool is the same one you used in Week 4.
10. Document the cost. Compare the wall-clock time of your full pipeline before and after this lab. The sanitize stage adds work; document by how much in your MR description. As in Week 7, "the sanitize stage adds 6 minutes and we caught two real bugs" is a more honest finding than rounding to "negligible".
11. Review and merge. Open the MR. The reviewer should check: the Sanitize build type works locally and in CI, that the three deliberate-bug pipeline runs are visible, that the Valgrind output is referenced in the MR, that coverage has not regressed.
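Tasks 5 to 7 each ask for one deliberate bug. The sketch below shows what the three bug classes can look like in isolation. Everything in it (the Fault enum, parse_fault, inject) is hypothetical and exists only for illustration. In the lab itself the bug goes into your production code and is reverted afterwards. The volatile qualifiers are there so that a light optimizer is less likely to fold the faulty operation away.

```cpp
#include <cassert>
#include <cstddef>
#include <limits>
#include <string>

// Hypothetical selector for the three deliberate-bug exercises.
enum class Fault { None, HeapOverflow, SignedOverflow, Leak };

Fault parse_fault(const std::string& name) {
    if (name == "heap-overflow") return Fault::HeapOverflow;
    if (name == "signed-overflow") return Fault::SignedOverflow;
    if (name == "leak") return Fault::Leak;
    return Fault::None;
}

// Executes the selected faulty operation; Fault::None does nothing.
int inject(Fault fault) {
    volatile std::size_t idx = 4;  // volatile: value is not a compile-time constant
    volatile int one = 1;
    switch (fault) {
    case Fault::HeapOverflow: {
        int* buf = new int[4]{};
        int value = buf[idx];      // reads one element past the allocation
        delete[] buf;
        return value;
    }
    case Fault::SignedOverflow:
        return std::numeric_limits<int>::max() + one;  // INT_MAX + 1 is undefined
    case Fault::Leak: {
        int* volatile lost = new int[16]{};  // never freed, pointer dropped on return
        (void)lost;
        return 0;
    }
    case Fault::None:
        return 0;
    }
    return 0;
}
```

In a Sanitize build, the first case should produce a heap-buffer-overflow report from ASan, the second a "runtime error" line from UBSan, and the third a leak report at process exit. The exact wording and line numbers depend on your compiler version.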
Stretch tasks (optional, if you finish early)
Pick at most one. None earn extra points; all teach you something useful.
- Add a second sanitizer build. Some teams maintain two CI jobs: one with ASan+UBSan, one with ThreadSanitizer. Add a TSan job that builds with -fsanitize=thread and runs the tests. Even on single-threaded code this is harmless, and it sets you up for the day you add concurrency.
- Valgrind in CI. Add a manual-trigger CI job (using GitLab's when: manual) that runs Valgrind on your test binary. It is too slow to run on every push, but a one-click button for the weekly deep audit is valuable.
- Coverage threshold gate. Use gcovr --fail-under-line N to fail the pipeline if line coverage drops below a threshold you choose (e.g. 80%). Now coverage is also an automatic quality gate, not just a number.
- Sanitizer-specific environment variables. Read the docs for ASAN_OPTIONS and UBSAN_OPTIONS. Add halt_on_error=1 and abort_on_error=1 to your CI environment, which makes the first error a hard stop instead of letting the program limp on. Useful for getting clearer reports.
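As an alternative to setting these variables in the CI environment, the sanitizer runtimes also look for default-option hooks defined by the program itself. The sketch below assumes you would rather bake the defaults into a test binary. Whether that is preferable to CI variables is a judgment call, since it also changes behavior on developer machines.

```cpp
#include <cassert>
#include <cstring>

// Hook read by the ASan runtime at startup, if the program defines it.
// The option string uses the same syntax as the ASAN_OPTIONS variable.
extern "C" const char* __asan_default_options() {
    return "halt_on_error=1:abort_on_error=1";
}

// Equivalent hook for the UBSan runtime (same syntax as UBSAN_OPTIONS).
extern "C" const char* __ubsan_default_options() {
    return "halt_on_error=1:abort_on_error=1";
}
```

In a build without sanitizers these are just two unused functions. Options passed through ASAN_OPTIONS or UBSAN_OPTIONS at run time are applied on top of these defaults.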
Deliverables
For this lab to be considered complete, your main branch must contain:
- A Dockerfile that installs valgrind in addition to the previous week's tooling.
- A CMakeLists.txt with a working Sanitize build type.
- A .gitlab-ci.yml with a sanitize stage running after the test stage.
- A merged Merge Request whose linked Issue (or MR description) contains:
  - References to three failed pipeline runs from Tasks 5, 6, and 7 (one each for ASan, UBSan, and LeakSan), with the relevant snippet of each sanitizer's report quoted briefly.
  - The Valgrind output from Task 8, or a permalink to it (a job artifact or a gist).
  - A note comparing Valgrind's findings with the sanitizers' findings (Task 8).
  - The coverage comparison from Task 9 (before/after, plus what you did about it).
  - The pipeline cost comparison from Task 10.
The VPL deliverables are the first three files; these are the only ones graded.
Common pitfalls
- Linker errors mentioning __asan_* or __ubsan_* symbols. You set the sanitizer flag for compilation but forgot to also pass it at the link step. In CMake, both add_compile_options and add_link_options need -fsanitize=....
- Sanitizers and Valgrind conflict. You cannot run a sanitizer-built binary under Valgrind. They both rely on intercepting memory accesses and the interception clashes. Build a separate Debug binary (no sanitizers) when you want to run Valgrind.
- ASan reports look truncated or have ?? in the backtrace. Almost always means missing -g or missing -fno-omit-frame-pointer. Without frame pointers, the unwinder on x86_64 falls back to imperfect heuristics.
- The pipeline passes locally but ASan errors only appear in CI. Check that your container image is using a recent enough compiler. GCC's ASan got significantly better between 8.x and 11.x; UBSan added several checks in the same span. The Debian Bookworm g++ (12.x) is fine.
- Tests pass under the sanitizers locally, but a teammate's machine shows different errors. ASan and UBSan are deterministic for a given binary, but slight platform differences (libc version, glibc malloc settings) can occasionally surface bugs that are real but only fire on one machine. The CI container is the source of truth.
- Valgrind reports complaints in the standard library. Usually you can ignore these unless they reference your own code. Valgrind already loads the default suppression file shipped with it, which filters out the well-known libc and libstdc++ noise. For anything that remains, generate entries with --gen-suppressions=all, save them to a file of your own, and pass it with --suppressions=<file>.
- ASan says "AddressSanitizer: DEADLYSIGNAL" with no obvious cause. Your program received a fatal signal (often SIGSEGV), which ASan's signal handler caught and reported. This usually means very serious memory corruption or stack overflow. Use ASAN_OPTIONS=abort_on_error=1 to get a core dump and a clearer backtrace (on 64-bit targets you typically also need disable_coredump=0, because ASan turns core dumps off by default).
- Some sanitizer findings disappear at -O2 or -O3. That is expected - the optimizer can eliminate the offending construct entirely. Keep your sanitizer build at -O1: optimized enough to be realistic, unoptimized enough to keep reports trustworthy.
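The last pitfall is easier to picture with a concrete construct. The sketch below uses hypothetical function names and an assumed integer temperature representation. The first guard depends on the very overflow it is trying to detect, so an optimizer is permitted to simplify it away. The second performs the same check without ever overflowing.

```cpp
#include <cassert>
#include <limits>

// Broken guard: t + step overflows before the comparison runs. Signed overflow
// is undefined, so the compiler may assume it never happens and drop the check.
bool broken_would_overflow(int t, int step) {
    return step > 0 && t + step < t;
}

// Well-defined guard: compares against the limits instead of overflowing.
bool would_overflow(int t, int step) {
    if (step > 0) return t > std::numeric_limits<int>::max() - step;
    if (step < 0) return t < std::numeric_limits<int>::min() - step;
    return false;
}
```

Whether a given compiler and optimization level actually removes the broken check varies, which is one more reason to fix the construct itself and not to rely on the report appearing.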
Looking ahead
Next week: Documentation as Code. You will add a Doxygen build to your CMake, publish the generated HTML docs as a GitLab Pages artifact, and turn your scattered code comments into navigable API documentation. Bring a piece of your capstone code that is poorly documented and you want to improve.