total: 63, pass: 58, fail: 5
add error handling to `tests/tests.py`
There's an edge case in the cleanup process of the test harness. When a
test dies, the harness removes the tmp directories and changes directory
to root. But sometimes a test case spawns children and dies while the
children are still running. The children keep writing to the tmp
directories, which breaks `shutil.rmtree()`. So I wrapped the cleanup
code in another `try-except` and added its error message to the
result.
The error messages from the cleanup code are not visible on the CI
dashboard, but we can still see them when we open the result file.
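
A minimal sketch of the idea, not the actual code in `tests/tests.py`: the names `cleanup`, `run_test`, and the `cleanup_error` key are hypothetical, and the result is assumed to be a plain dict. The sketch changes directory before removing the tmp directories, which is also an assumption. Cleanup failures are recorded in the result instead of crashing the harness.

```python
import os
import shutil
import tempfile
import traceback


def cleanup(tmp_dirs, root_dir, result):
    """Leave and remove the tmp dirs; record any failure instead of raising."""
    try:
        # Assumption: leave the tmp dir first so we never delete our own cwd.
        os.chdir(root_dir)
        for tmp_dir in tmp_dirs:
            # May fail if orphaned child processes are still writing here.
            shutil.rmtree(tmp_dir)
    except Exception:
        # Hypothetical key: keeps the cleanup error in the result file.
        result["cleanup_error"] = traceback.format_exc()


def run_test(test, root_dir):
    """Run one test in a fresh tmp dir and always attempt cleanup."""
    result = {"pass": False, "error": None, "cleanup_error": None}
    tmp_dir = tempfile.mkdtemp()
    try:
        os.chdir(tmp_dir)
        test()
        result["pass"] = True
    except Exception:
        result["error"] = traceback.format_exc()
    finally:
        cleanup([tmp_dir], root_dir, result)
    return result
```

Keeping `cleanup_error` separate from `error` means a failed cleanup does not overwrite the test's own failure message.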
elapsed time: 291,758 ms
elapsed time: 34,533 ms
elapsed time: 11,640 ms
elapsed time: 3,196 ms
elapsed time: 3,099 ms
elapsed time: 126,926 ms
elapsed time: 5,382 ms
elapsed time: 8,979 ms
elapsed time: 137,829 ms
elapsed time: 34,286 ms
elapsed time: 5,605 ms
elapsed time: 3,027 ms
elapsed time: 181,736 ms
elapsed time: 568,013 ms
elapsed time: 154,813 ms
elapsed time: 77,940 ms
elapsed time: 1,717 ms
elapsed time: 3,083 ms
elapsed time: 2,732 ms
elapsed time: 154,964 ms
tfidf result on term 'let bitxor' is not close enough. error: `answer[2] not in approximation`, answer: ['549a87567bd9ec4c3145df3f3db3c7f285e9b1551269c1720000000100000200', 'faae85ed9ecfe34ecc388f777a6adabbde32f4628b65cd820000000100000200', '445d453adb1948c8f9cce13eb8f0974869152c2f1e9e4b95000000010000016e', 'ddd47326a625b60b374ab71c97a87a8e3de207a73288da380000000100000204', '78973af7ecb7137f86b7e126889bdb947505308e35fad8b50000000100000200', '5b252d0f47ac8d0bee53b3c3c0387897869e9834d9e43edf000000010000020f', 'f28f8995ed5aad84ba253b191d1257e234e37f8f1ad572240000000100000202', '74a9821e38482554bd3e78b0f1c474a927848122a54223420000000100000212', '63cb62bb8a33a334f049b70be4cdd592d385215ade59690c0000000100000207', '5d6c16dc1991936cb1b2a3d821c43a8bf648d298a24992a70000000100000200'], approximation: ['549a87567bd9ec4c3145df3f3db3c7f285e9b1551269c1720000000100000200', 'faae85ed9ecfe34ecc388f777a6adabbde32f4628b65cd820000000100000200', '78973af7ecb7137f86b7e126889bdb947505308e35fad8b50000000100000200', '5b252d0f47ac8d0bee53b3c3c0387897869e9834d9e43edf000000010000020f', '74a9821e38482554bd3e78b0f1c474a927848122a54223420000000100000212', '63cb62bb8a33a334f049b70be4cdd592d385215ade59690c0000000100000207', 'd2178deab15f668fd917a5bc4fc39360e3c3d3d87a353a920000000100000200', 'b82a8fb6160d01b1c8a289e0af1b714cf92d89aaced6cbdb0000000100000202', 'b7f52a1abcdae26519257214f1837e3885856589649424170000000100000203', 'fa10425e1445e6bd8b59f7c047762787e114d2a8d4d1ef5b0000000100000200']
Traceback (most recent call last):
  File "/home/baehyunsol/Documents/ragit/tests/ii.py", line 100, in ii_worker
    raise AssertionError(f"answer[{i}] not in approximation")
AssertionError: answer[2] not in approximation
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
  File "/home/baehyunsol/Documents/ragit/tests/tests.py", line 672, in <module>
    test()
  File "/home/baehyunsol/Documents/ragit/tests/ii.py", line 36, in ii
    ii_worker()
  File "/home/baehyunsol/Documents/ragit/tests/ii.py", line 116, in ii_worker
    raise AssertionError(f"tfidf result on term '{term}' is not close enough. error: `{e}`, answer: {answer}, approximation: {approximation}")
AssertionError: tfidf result on term 'let bitxor' is not close enough. error: `answer[2] not in approximation`, answer: ['549a87567bd9ec4c3145df3f3db3c7f285e9b1551269c1720000000100000200', 'faae85ed9ecfe34ecc388f777a6adabbde32f4628b65cd820000000100000200', '445d453adb1948c8f9cce13eb8f0974869152c2f1e9e4b95000000010000016e', 'ddd47326a625b60b374ab71c97a87a8e3de207a73288da380000000100000204', '78973af7ecb7137f86b7e126889bdb947505308e35fad8b50000000100000200', '5b252d0f47ac8d0bee53b3c3c0387897869e9834d9e43edf000000010000020f', 'f28f8995ed5aad84ba253b191d1257e234e37f8f1ad572240000000100000202', '74a9821e38482554bd3e78b0f1c474a927848122a54223420000000100000212', '63cb62bb8a33a334f049b70be4cdd592d385215ade59690c0000000100000207', '5d6c16dc1991936cb1b2a3d821c43a8bf648d298a24992a70000000100000200'], approximation: ['549a87567bd9ec4c3145df3f3db3c7f285e9b1551269c1720000000100000200', 'faae85ed9ecfe34ecc388f777a6adabbde32f4628b65cd820000000100000200', '78973af7ecb7137f86b7e126889bdb947505308e35fad8b50000000100000200', '5b252d0f47ac8d0bee53b3c3c0387897869e9834d9e43edf000000010000020f', '74a9821e38482554bd3e78b0f1c474a927848122a54223420000000100000212', '63cb62bb8a33a334f049b70be4cdd592d385215ade59690c0000000100000207', 'd2178deab15f668fd917a5bc4fc39360e3c3d3d87a353a920000000100000200', 'b82a8fb6160d01b1c8a289e0af1b714cf92d89aaced6cbdb0000000100000202', 'b7f52a1abcdae26519257214f1837e3885856589649424170000000100000203', 'fa10425e1445e6bd8b59f7c047762787e114d2a8d4d1ef5b0000000100000200']
elapsed time: 14,142 ms
elapsed time: 297,970 ms
elapsed time: 1,421 ms
elapsed time: 4,389 ms
elapsed time: 5,193 ms
elapsed time: 2,809 ms
elapsed time: 2,320,500 ms
Command '['cargo', 'run', '--release', '--', 'init']' returned non-zero exit status 101.
Traceback (most recent call last):
  File "/home/baehyunsol/Documents/ragit/tests/tests.py", line 672, in <module>
    test()
  File "/home/baehyunsol/Documents/ragit/tests/real_repos.py", line 64, in real_repos
    cargo_run(["init"])
  File "/home/baehyunsol/Documents/ragit/tests/utils.py", line 74, in cargo_run
    result = subprocess.run(args, **kwargs)
  File "/usr/lib/python3.10/subprocess.py", line 526, in run
    raise CalledProcessError(retcode, process.args,
subprocess.CalledProcessError: Command '['cargo', 'run', '--release', '--', 'init']' returned non-zero exit status 101.
elapsed time: 2,543 ms
Traceback (most recent call last):
  File "/home/baehyunsol/Documents/ragit/tests/tests.py", line 672, in <module>
    test()
  File "/home/baehyunsol/Documents/ragit/tests/real_repos_regression.py", line 157, in real_repos_regression
    assert count_files() == (len(reproductions), 1, len(reproductions) - 1)  # (total, staged, processed)
AssertionError
elapsed time: 9,002 ms
elapsed time: 10,134 ms
elapsed time: 14,380 ms
elapsed time: 70,458 ms
elapsed time: 42,427 ms
elapsed time: 54,228 ms
elapsed time: 5,978 ms
elapsed time: 3,843 ms
elapsed time: 4,576 ms
elapsed time: 44,748 ms
elapsed time: 26,573 ms
elapsed time: 61,614 ms
elapsed time: 2,797 ms
elapsed time: 3,791 ms
elapsed time: 21,599 ms
elapsed time: 44,996 ms
elapsed time: 6,645 ms
elapsed time: 8,662 ms
elapsed time: 18,365 ms
elapsed time: 142,277 ms
Traceback (most recent call last):
  File "/home/baehyunsol/Documents/ragit/tests/tests.py", line 672, in <module>
    test()
  File "/home/baehyunsol/Documents/ragit/tests/tests.py", line 622, in <lambda>
    ("pdf gpt-4o-mini", lambda: pdf(test_model="gpt-4o-mini")),
  File "/home/baehyunsol/Documents/ragit/tests/pdf.py", line 51, in pdf
    assert any([pdf["name"] in r["source"] for r in search_result])
AssertionError
elapsed time: 7,873 ms
Traceback (most recent call last):
  File "/home/baehyunsol/Documents/ragit/tests/tests.py", line 672, in <module>
    test()
  File "/home/baehyunsol/Documents/ragit/tests/tests.py", line 623, in <lambda>
    ("svg gpt-4o-mini", lambda: svg(test_model="gpt-4o-mini")),
  File "/home/baehyunsol/Documents/ragit/tests/svg.py", line 123, in svg
    assert "ragit" in cargo_run(["pdl", "test1.pdl"], stdout=True).lower()
AssertionError
elapsed time: 62,873 ms
elapsed time: 9,756 ms
elapsed time: 1,632 ms
elapsed time: 9,060 ms
elapsed time: 99,776 ms
elapsed time: 83,057 ms
elapsed time: 1,049 ms
elapsed time: 5,559 ms
elapsed time: 1,624 ms
elapsed time: 283 ms
elapsed time: 134 ms
elapsed time: 32,730 ms
elapsed time: 143,731 ms
elapsed time: 43,784 ms