A Retrospect On New Test Conversions

A few weeks have passed, and more test files have been converted to use the clar testing framework. Only the reftable-related tests are still waiting to be converted. In this blog post, I want to discuss how the conversion of the following test files went, the problems I faced, and how they were mitigated:

t-oidmap.c
t-oidtree.c
t-oid-array.c
t-trailer.c
t-urlmatch-normalization.c

Pre-conversion stage

The test conversions kicked off with the oid (object ID) related test files. In the previous structure, all oid-related test suites depended on two functions: get_oid_arbitrary_hex() and init_hash_algo(). init_hash_algo() determines which hash algorithm Git should use, either from an environment variable or a default (SHA-1). get_oid_arbitrary_hex() converts a hexadecimal string into an object ID using the selected hash algorithm, and returns an error if the algorithm is unknown. Before the dependent test files could be converted, these helpers needed clar-based equivalents.
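
To make the role of these two helpers concrete, here is a minimal, self-contained sketch. It does not use Git's real types or code: every toy_* name and the TOY_TEST_DEFAULT_HASH environment variable are hypothetical, and I am assuming that a short hex string is padded with zeros up to the full hex length of the selected algorithm.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for Git's table of hash algorithms. */
enum toy_hash_algo { TOY_HASH_UNKNOWN, TOY_HASH_SHA1, TOY_HASH_SHA256 };

/* Length of a full object ID in hex characters for each algorithm. */
static size_t toy_hexsz(enum toy_hash_algo algo)
{
	switch (algo) {
	case TOY_HASH_SHA1:
		return 40;
	case TOY_HASH_SHA256:
		return 64;
	default:
		return 0;
	}
}

/* Map a name to an algorithm; NULL means "use the default" (SHA-1). */
static enum toy_hash_algo toy_hash_algo_by_name(const char *name)
{
	if (!name || !strcmp(name, "sha1"))
		return TOY_HASH_SHA1;
	if (!strcmp(name, "sha256"))
		return TOY_HASH_SHA256;
	return TOY_HASH_UNKNOWN;
}

/* Pick the algorithm from the environment, falling back to the default. */
static enum toy_hash_algo toy_init_hash_algo(void)
{
	return toy_hash_algo_by_name(getenv("TOY_TEST_DEFAULT_HASH"));
}

/*
 * Expand a short hex string into a full-length hex object ID by padding
 * it with '0'. `out` must hold at least 65 bytes. Returns 0 on success,
 * or -1 for an unknown algorithm, an over-long input, or a non-hex digit.
 */
static int toy_arbitrary_hex(const char *hex, enum toy_hash_algo algo, char *out)
{
	size_t hexsz, len, i;

	assert(hex && out);
	hexsz = toy_hexsz(algo);
	len = strlen(hex);
	if (!hexsz || len > hexsz)
		return -1;
	for (i = 0; i < len; i++)
		if (!strchr("0123456789abcdef", hex[i]))
			return -1;
	memcpy(out, hex, len);
	memset(out + len, '0', hexsz - len);
	out[hexsz] = '\0';
	return 0;
}
```

With this sketch, a test can write a short value such as "abc" and get a full-length ID for whichever algorithm is active, which is what lets one set of test data work under both SHA-1 and SHA-256.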

These helper functions were reworked to suit the new structure and to perform clar-based assertions themselves: one asserts that a valid hash algorithm is selected, and the other asserts that a hex string can be converted into an object ID. This makes them more test-oriented. The resulting helper functions are cl_setup_hash_algo() and cl_parse_any_oid().

Now let's break down how each of the following test files works:

t-oid-array.c: This file defines unit tests for oid_array, Git's array of object IDs (OIDs). It ensures that OIDs are correctly stored, sorted, and searched. The function t_enumeration() checks that OIDs are enumerated in order with duplicates removed, while t_lookup() verifies that looking up an OID in an array returns the correct index. The fill_array() function converts hex strings into OIDs and adds them to an array, and add_to_oid_array() appends a single OID to an array. The setup() function initializes the hash algorithm, and cmd_main() runs all test cases. Tests are declared through macros such as TEST_ENUMERATION and TEST_LOOKUP.
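
The behavior those two tests exercise can be sketched in a few lines. This is a simplified model, not Git's oid_array: it stores hex strings instead of real object IDs, all toy_* names are hypothetical, and it returns -1 for a missing entry as an assumption made for brevity.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for an OID array: a list of hex strings. */
struct toy_oid_array {
	const char **oid;
	size_t nr;
};

/* qsort() comparator for an array of C strings. */
static int toy_cmp(const void *a, const void *b)
{
	return strcmp(*(const char *const *)a, *(const char *const *)b);
}

/* Sort in place, then drop adjacent duplicates; returns the new count. */
static size_t toy_sort_unique(struct toy_oid_array *arr)
{
	size_t i, out = 0;

	if (!arr->nr)
		return 0;
	qsort(arr->oid, arr->nr, sizeof(*arr->oid), toy_cmp);
	for (i = 1; i < arr->nr; i++)
		if (strcmp(arr->oid[out], arr->oid[i]))
			arr->oid[++out] = arr->oid[i];
	arr->nr = out + 1;
	return arr->nr;
}

/* Binary search on a sorted array; returns the index, or -1 if absent. */
static int toy_lookup(const struct toy_oid_array *arr, const char *want)
{
	size_t lo = 0, hi;

	assert(arr && want);
	hi = arr->nr;
	while (lo < hi) {
		size_t mid = lo + (hi - lo) / 2;
		int cmp = strcmp(arr->oid[mid], want);

		if (!cmp)
			return (int)mid;
		if (cmp < 0)
			lo = mid + 1;
		else
			hi = mid;
	}
	return -1;
}
```

An enumeration test then boils down to comparing the array contents after toy_sort_unique() with an expected list, and a lookup test compares the returned index with the expected position.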

t-oidmap.c: This file tests the functionality of oidmap, a hash map for storing object IDs (OIDs) and their associated names. It ensures that OIDs can be inserted, retrieved, replaced, removed, and iterated over correctly. The setup() function initializes the map and inserts predefined key-value pairs. Individual test functions check whether entries can be replaced (t_replace()), retrieved (t_get()), removed (t_remove()), and iterated over (t_iterate()). The cmd_main() function runs these tests, verifying correct behavior using assertions and debug messages.
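
As a rough illustration of the operations being tested, here is a tiny map with put, get, and remove. It is only a sketch: the toy_* names are hypothetical, keys are hex strings rather than real object IDs, and it uses a linear search over a fixed-size array instead of hashing.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define TOY_MAP_MAX 8

/* Hypothetical entry: an object ID (as hex text) and its associated name. */
struct toy_entry {
	const char *oid;
	const char *name;
};

struct toy_map {
	struct toy_entry entries[TOY_MAP_MAX];
	size_t nr;
};

/* Linear search keeps the sketch short; a real map would hash the key. */
static struct toy_entry *toy_find(struct toy_map *map, const char *oid)
{
	size_t i;

	for (i = 0; i < map->nr; i++)
		if (!strcmp(map->entries[i].oid, oid))
			return &map->entries[i];
	return NULL;
}

/* Insert or replace; returns the name that was replaced, or NULL. */
static const char *toy_put(struct toy_map *map, const char *oid, const char *name)
{
	struct toy_entry *hit = toy_find(map, oid);
	const char *old;

	if (hit) {
		old = hit->name;
		hit->name = name;
		return old;
	}
	assert(map->nr < TOY_MAP_MAX);
	map->entries[map->nr].oid = oid;
	map->entries[map->nr].name = name;
	map->nr++;
	return NULL;
}

/* Returns the stored name, or NULL if the key is absent. */
static const char *toy_get(struct toy_map *map, const char *oid)
{
	struct toy_entry *hit = toy_find(map, oid);

	return hit ? hit->name : NULL;
}

/* Remove a key by moving the last entry into its slot. */
static const char *toy_remove(struct toy_map *map, const char *oid)
{
	struct toy_entry *hit = toy_find(map, oid);
	const char *old;

	if (!hit)
		return NULL;
	old = hit->name;
	*hit = map->entries[--map->nr];
	return old;
}
```

Each test in the file maps onto one of these calls: a replace test checks the value returned by a second put on the same key, and a remove test checks that a later get returns nothing.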

t-oidtree.c: This file tests the functionality of oidtree, a data structure that stores and retrieves object IDs efficiently. It defines helper functions for inserting object IDs (fill_tree_loc) and for checking whether an object ID exists (check_contains). The check_each function verifies that iterating over stored object IDs returns the expected results using a callback mechanism.

The test functions t_contains and t_each validate key behaviors of oidtree. t_contains ensures that inserted object IDs can be correctly found or not found, while t_each checks that partial queries return the correct set of object IDs. The setup function initializes and clears the oidtree for each test, ensuring test isolation.

Finally, cmd_main runs the tests using TEST(setup(t_contains)) and TEST(setup(t_each)), confirming that oidtree insertion, lookup, and iteration work as expected.
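
The callback-based check behind t_each can be sketched as follows. This is not oidtree itself: the "tree" is a plain array of hex strings, and toy_each_prefix(), toy_check_cb(), and struct toy_expected are hypothetical names used only for this illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Callback type: return nonzero to stop the iteration early. */
typedef int (*toy_each_fn)(const char *oid, void *data);

/*
 * Call `fn` for every entry in `oids` that starts with `prefix`.
 * Returns the number of entries visited.
 */
static size_t toy_each_prefix(const char **oids, size_t nr, const char *prefix,
			      toy_each_fn fn, void *data)
{
	size_t i, visited = 0, plen;

	assert(prefix && fn);
	plen = strlen(prefix);
	for (i = 0; i < nr; i++) {
		if (strncmp(oids[i], prefix, plen))
			continue;
		visited++;
		if (fn(oids[i], data))
			break;
	}
	return visited;
}

/* Expected results, consumed one by one by the callback. */
struct toy_expected {
	const char **want;
	size_t nr, i;
	int mismatches;
};

/* Compare each visited entry with the next expected one. */
static int toy_check_cb(const char *oid, void *data)
{
	struct toy_expected *exp = data;

	if (exp->i >= exp->nr || strcmp(exp->want[exp->i], oid))
		exp->mismatches++;
	exp->i++;
	return 0;
}
```

After the iteration, a test only has to confirm that no mismatch was recorded and that every expected entry was consumed.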

t-trailer.c: This file tests Git’s ability to parse trailers (metadata lines such as Signed-off-by and Fixes) from commit messages. It defines the function t_trailer_iterator, which initializes a trailer iterator, extracts trailer lines, and verifies that their raw text, keys, and values match the expected results. Another function, run_t_trailer_iterator, runs multiple test cases with different commit message formats, including cases with and without body text, multiple trailer blocks, and non-trailer lines within the trailer section. The tests also ensure that Git correctly prioritizes the last trailer block and ignores dividers like ---.
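
To show what "key" and "value" mean for a single trailer line, here is a deliberately simplified sketch. It is not Git's trailer parser: toy_parse_trailer() is a hypothetical helper that assumes ':' is the only separator and does not handle trailer blocks, dividers, or continuation lines.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Split one "Key: value" line into its key and value.
 * Returns 0 on success, or -1 if the line has no key, has no ':'
 * separator, or does not fit into the supplied buffers.
 */
static int toy_parse_trailer(const char *line, char *key, size_t keysz,
			     char *val, size_t valsz)
{
	const char *sep, *v;
	size_t klen, vlen;

	assert(line && key && val);
	sep = strchr(line, ':');
	if (!sep || sep == line)
		return -1;
	klen = (size_t)(sep - line);
	for (v = sep + 1; *v == ' ' || *v == '\t'; v++)
		; /* skip whitespace after the separator */
	vlen = strcspn(v, "\n"); /* the value ends at the newline, if any */
	if (klen >= keysz || vlen >= valsz)
		return -1;
	memcpy(key, line, klen);
	key[klen] = '\0';
	memcpy(val, v, vlen);
	val[vlen] = '\0';
	return 0;
}
```

For the line "Signed-off-by: A U Thor <author@example.com>", this yields the key "Signed-off-by" and the value "A U Thor <author@example.com>", which are the kinds of pairs the iterator tests compare against their expectations.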

t-urlmatch-normalization.c: This file contains unit tests to validate and normalize URLs using the url_normalize function. It ensures that different parts of a URL, such as the scheme, authority, and port, are correctly processed and normalized. The function check_url_normalizable verifies whether a given URL can be normalized, while check_normalized_url ensures that a URL is correctly transformed into an expected value. The compare_normalized_urls function checks if two different URLs normalize to the same result, and check_normalized_url_length validates the length of a normalized URL.

The test functions, such as t_url_scheme, t_url_authority, and t_url_port, each focus on a specific aspect of URL validation. They cover various cases, including invalid schemes, missing hosts, malformed ports, and reserved characters. Together, the tests ensure that valid URLs are handled correctly and that invalid ones are identified and rejected, keeping URL normalization consistent.
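
The idea of "two URLs normalize to the same result" can be illustrated with a much smaller normalizer than url_normalize. The sketch below is an assumption-laden simplification: toy_normalize() is a hypothetical function that only lowercases the scheme and host and rejects a missing scheme or host, and it ignores ports, percent-encoding, and path handling entirely.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/*
 * Lowercase the scheme and host of `url` in place.
 * Returns 0 on success, or -1 if the scheme or host is missing.
 */
static int toy_normalize(char *url)
{
	char *sep, *host, *end, *at, *p;

	assert(url);
	sep = strstr(url, "://");
	if (!sep || sep == url)
		return -1; /* no scheme */
	host = sep + 3;
	end = host + strcspn(host, "/?#"); /* end of the authority part */
	at = memchr(host, '@', (size_t)(end - host));
	if (at)
		host = at + 1; /* userinfo is case-sensitive, leave it alone */
	if (host == end)
		return -1; /* no host */
	for (p = url; p < sep; p++)
		*p = (char)tolower((unsigned char)*p);
	for (p = host; p < end; p++)
		*p = (char)tolower((unsigned char)*p);
	return 0;
}
```

Comparing two URLs then means normalizing both and checking the results with strcmp(), while a "not normalizable" check simply expects the -1 return value.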

Conversion stage

u-oid-array.c: The new implementation replaces the test macros with dedicated test functions (test_oid_array__*), making tests easier to run and maintain. It also replaces get_oid_arbitrary_hex() with cl_parse_any_oid(), which provides better error handling, and assertions now use cl_assert_*, improving failure messages. The suite still tests storing, sorting, and looking up object IDs (OIDs) in oid_array: fill_array() converts hex strings into OIDs, t_enumeration() checks sorting and duplicate removal, and t_lookup() verifies OID lookup.

Additionally, test_oid_array__initialize() handles hash algorithm setup separately, avoiding redundancy. Organizing the tests into self-explanatory functions improves modularity and readability, integrates cleanly with the clar testing framework, and makes future modifications easier.

u-oidmap.c: This version improves on the previous one by moving from the custom test framework to a more standardized and structured testing design. Instead of using a setup() function for initialization and cleanup, it introduces dedicated functions: test_oidmap__initialize() and test_oidmap__cleanup(). This ensures better modularity and test isolation.

A key improvement is the use of a global oidmap instance, which eliminates the need to pass the map between functions, making the code simpler. Assertions have also been improved by using cl_assert() and cl_assert_equal_s(), which make the tests more readable and standardized. Additionally, failure reporting has been enhanced with cl_failf(), which provides clearer error messages in test_oidmap__iterate(), making debugging easier.

The test structure has been refined, with each test function (test_oidmap__replace(), test_oidmap__get(), test_oidmap__remove(), and test_oidmap__iterate()) clearly defined and isolated. This improves maintainability and reusability, and keeps the suite aligned with best practices for unit testing.
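
The initialize/test/cleanup cycle is what provides that isolation. The sketch below imitates it without clar: the test_toy__* functions follow clar's test_<suite>__<name> naming, but toy_run_suite() is a hypothetical stand-in for the runner clar generates, plain assert() stands in for cl_assert(), and the integer fixture stands in for the global oidmap.

```c
#include <assert.h>
#include <stddef.h>

/* Shared fixture, playing the role of the global oidmap instance. */
static int toy_map_entries;
static int toy_init_calls, toy_cleanup_calls;

/* Runs before every test: build a fresh fixture. */
void test_toy__initialize(void)
{
	toy_map_entries = 3;
	toy_init_calls++;
}

/* Runs after every test: tear the fixture down. */
void test_toy__cleanup(void)
{
	toy_map_entries = 0;
	toy_cleanup_calls++;
}

/* Mutates the fixture; later tests must not see this change. */
void test_toy__remove(void)
{
	assert(toy_map_entries == 3);
	toy_map_entries--;
}

/* Still sees three entries because initialize ran again. */
void test_toy__get(void)
{
	assert(toy_map_entries == 3);
}

/* Hypothetical stand-in for the runner that clar generates. */
static void toy_run_suite(void)
{
	void (*tests[])(void) = { test_toy__remove, test_toy__get };
	size_t i;

	for (i = 0; i < sizeof(tests) / sizeof(tests[0]); i++) {
		test_toy__initialize();
		tests[i]();
		test_toy__cleanup();
	}
}
```

Because the fixture is rebuilt before each test, test_toy__get() passes even though test_toy__remove() ran first and changed the shared state.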

u-oidtree.c: This version improves upon the previous implementation in several key ways. The function fill_tree_loc has been simplified by calling cl_parse_any_oid(...) directly instead of using check_int(get_oid_arbitrary_hex(...)). Another improvement is the replacement of the check_* macros with cl_assert_* functions, which provide clearer assertion failures and make debugging easier. The update also introduces dedicated setup and teardown functions, test_oidtree__initialize() and test_oidtree__cleanup(), which help maintain a structured and organized test suite.

Furthermore, the function check_each_cb has been improved by replacing manual error messages with direct assertions using cl_assert_equal_s(...), simplifying the validation process and ensuring that mismatches are detected more efficiently.

u-trailer.c: This update improves Git’s trailer parsing tests by breaking up previously combined test cases into separate, more granular tests. Instead of handling multiple scenarios within a single function, each test case now focuses on a specific aspect of trailer parsing, making it easier to understand, debug, and maintain.

The update retains all existing test cases but organizes them into distinct functions, ensuring a clearer separation of different trailer parsing behaviors. These include cases where commit messages contain no body text, multiple trailer blocks, trailers with non-trailer lines, and the effect of dividers (---). The refactoring also ensures that the last encountered trailer block takes precedence when multiple blocks are present.

By structuring the tests this way, the update improves test readability and makes debugging more efficient. Each assertion now clearly corresponds to a specific scenario, reducing ambiguity in failure reports.

u-urlmatch-normalization.c: This version uses helper functions to wrap repetitive assertions, reducing duplication and making the code cleaner and easier to maintain. It systematically validates different aspects of URL normalization, such as scheme handling, authority validation, port handling, and port normalization.

The test cases are grouped by URL component, improving readability and making it easier to debug failures. Additionally, the use of cl_assert_equal_i and cl_assert_equal_s produces precise failure messages. The suite also frees the memory it allocates dynamically, preventing potential memory leaks.

Difficulties

The major problem came up before I realized that the oid-related tests depend on the helper functions discussed above. The updated tests kept failing because the clar-based versions could not find those functions, which were not yet wired up for clar. My initial solution was to add the functions to our unit-test.{c,h} files, but with the help of my mentors I soon realized that every other test file would then depend on those functions too, even though they have no use for them. This is because all clar-based tests are wired through unit-test.{c,h}, which makes including that header mandatory.

Next Steps

The next thing on my agenda is to tackle the reftable test files, which will mark the final batch of test conversions to clar. Whew!