5ba0a12a6-mac

total: 57, pass: 51, fail: 6

fix plain text reader

The plain text reader now refuses to process text files that are not
valid UTF-8, for two reasons:

1. Now that ragit continues processing files even if it hits an
   erroneous file, it's okay to throw more errors. They won't bother
   users.
2. The plain text reader is the default file reader. If a user
   mistakenly adds a random file, which is likely to be a binary file,
   ragit will use the plain text reader. If the reader used
   `String::from_utf8_lossy`, it would generate a chunk with tons of
   REPLACEMENT_CHARACTERs, which is a total waste of time and energy.
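
The difference between the two decoding strategies can be sketched as follows. This is only an illustration, not ragit's actual reader: the function names `read_plain_text_strict`, `read_plain_text_lossy`, and `count_replacement_chars` are hypothetical, and only the standard-library calls `String::from_utf8` and `String::from_utf8_lossy` are real.

```rust
use std::string::FromUtf8Error;

/// Strict decoding (hypothetical helper): fails on any invalid UTF-8,
/// so a binary file is rejected instead of being turned into a chunk.
fn read_plain_text_strict(bytes: &[u8]) -> Result<String, FromUtf8Error> {
    String::from_utf8(bytes.to_vec())
}

/// Lossy decoding (hypothetical helper): never fails, but replaces every
/// invalid byte sequence with U+FFFD REPLACEMENT CHARACTER.
fn read_plain_text_lossy(bytes: &[u8]) -> String {
    String::from_utf8_lossy(bytes).into_owned()
}

/// Counts how many replacement characters ended up in the decoded text.
fn count_replacement_chars(s: &str) -> usize {
    s.chars()
        .filter(|c| *c == char::REPLACEMENT_CHARACTER)
        .count()
}

fn main() {
    let text: &[u8] = "plain text".as_bytes();
    // A few bytes that are not valid UTF-8, standing in for a binary file.
    let binary: &[u8] = &[0x89, 0x50, 0x4E, 0x47, 0xFF, 0xFE, 0x00, 0x80];

    // Valid UTF-8 is accepted by both strategies.
    println!("strict(text)   = {:?}", read_plain_text_strict(text));
    println!("lossy(text)    = {:?}", read_plain_text_lossy(text));

    // The strict reader reports an error for the binary input...
    println!("strict(binary) is_err = {}", read_plain_text_strict(binary).is_err());

    // ...while the lossy reader silently produces replacement characters.
    let lossy = read_plain_text_lossy(binary);
    println!("lossy(binary) replacement chars = {}", count_replacement_chars(&lossy));
}
```

With the strict variant, a binary file surfaces as an error that the caller can report and skip, rather than as a chunk full of replacement characters.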

commit: 5ba0a12a6e9894e2a91b01c89cc324c81530d14b
platform: macOS-14.3.1-arm64-arm-64bit
ragit version: ragit 0.4.0-dev
rustc version: rustc 1.89.0-nightly (777d37277 2025-05-17)
cargo version: cargo 1.89.0-nightly (47c911e9e 2025-05-14)
python version: 3.11.7
tested at: 2025-05-19T10:47:10.063082Z
total elapsed time: 4,442,685 ms

cargo_tests
add_and_rm
add_and_rm2
ignore
recover
clone
clone_empty
pull
server
server_permission
cli
archive
many_chunks
many_jobs
ls
meta
symlink
ii
cat_file
generous_file_reader
images
markdown_reader
csv_reader
real_repos
subdir
tfidf
merge
external_bases
end_to_end dummy
end_to_end llama3.3-70b
audit llama3.3-70b
logs llama3.3-70b
prompts dummy
prompts gpt-4o-mini
prompts claude-3.5-sonnet
empty dummy
empty llama3.3-70b
server_chat llama3.3-70b
images2 gpt-4o-mini
images3 gpt-4o-mini
pdl gpt-4o-mini
pdf gpt-4o-mini
svg gpt-4o-mini
web_images gpt-4o-mini
images2 claude-3.5-sonnet
extract_keywords dummy
extract_keywords gpt-4o-mini
orphan_process llama3.3-70b
write_lock llama3.3-70b
ragit_api command-r
query_options llama3.3-70b
query_with_schema llama3.3-70b
models_init
test_home_config_override
migrate
migrate2
config

Cases

cargo_tests

elapsed time: 486,116 ms

history

add_and_rm

elapsed time: 139,760 ms

history

add_and_rm2

elapsed time: 46,391 ms

history

ignore

elapsed time: 14,760 ms

history

recover

elapsed time: 12,524 ms

history

clone

elapsed time: 136,480 ms

history

clone_empty

elapsed time: 10,888 ms

history

pull

elapsed time: 17,191 ms

history

server

elapsed time: 9,773 ms

Error

{'ragit_version': '0.4.0-dev', 'chunk_count': 0, 'staged_files': [], 'processed_files': {}, 'curr_processing_file': None, 'repo_url': None, 'ii_status': {'type': 'None'}, 'uid': None} != {'ragit_version': '0.4.0-dev', 'chunk_count': 0, 'staged_files': [], 'processed_files': {}, 'curr_processing_file': None, 'repo_url': None, 'ii_status': {'type': 'None'}, 'uid': {'high': 226965926617079404232257257017206136310, 'low': 98353210059702837338669213515912314881}}
Traceback (most recent call last):
  File "/Users/baehyunsol/Documents/Rust/ragit/tests/tests.py", line 630, in <module>
    test()
  File "/Users/baehyunsol/Documents/Rust/ragit/tests/server.py", line 51, in server
    assert_eq_json("index.json", index_json)
  File "/Users/baehyunsol/Documents/Rust/ragit/tests/server.py", line 266, in assert_eq_json
    raise ValueError(f"{file.__repr__()} != {value.__repr__()}")
ValueError: {'ragit_version': '0.4.0-dev', 'chunk_count': 0, 'staged_files': [], 'processed_files': {}, 'curr_processing_file': None, 'repo_url': None, 'ii_status': {'type': 'None'}, 'uid': None} != {'ragit_version': '0.4.0-dev', 'chunk_count': 0, 'staged_files': [], 'processed_files': {}, 'curr_processing_file': None, 'repo_url': None, 'ii_status': {'type': 'None'}, 'uid': {'high': 226965926617079404232257257017206136310, 'low': 98353210059702837338669213515912314881}}

history

server_permission

elapsed time: 2,529 ms

Error


Traceback (most recent call last):
  File "/Users/baehyunsol/Documents/Rust/ragit/tests/tests.py", line 630, in <module>
    test()
  File "/Users/baehyunsol/Documents/Rust/ragit/tests/server_permission.py", line 23, in server_permission
    create_user(id="test-user-2", email="sample2@email.com", password="abcdefgh")
  File "/Users/baehyunsol/Documents/Rust/ragit/tests/server.py", line 186, in create_user
    assert response.status_code == 200
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^
AssertionError

history

cli

elapsed time: 25,360 ms

history

meta

elapsed time: 8,005 ms

history

symlink

elapsed time: 10,789 ms

history

ii

elapsed time: 137,816 ms

Error

tfidf result on term 'search gpg annot select correspond' is not close enough. error: `approximation[2] not in answer`, answer: ['3e0d93ece16c10490435c08b7b755db9a57e53b818a9e62c0000000100000fa3', 'c5719c769542cb0cde49558784948082703f2da9618c29d80000000100000fb3', '6d1b2eeef26e5ce9672e62a7ca43412c66b86ad0e48d27620000000100000fa0', '606389435f969a017ad1cf63a7a30eba0d1a08c743efea9f0000000100000318', 'f386d96798aad5baf548b6985b367932bdc89483b756b515000000010000081f', 'c66345d5ab119b4cf05a6899472b54a4fd0041ee2b83b9f80000000100000fa2', 'bf8735875031f53ccd50e48e6674d9ac64c90f68bb0c7edb0000000100000fa0', '509b4b369f9f9729365a6947ce43335209d934562feeb7220000000100000fa2', '82ad9747a31109a3ef965e4168a0968cb56a448390416e290000000100000bf5', 'b632241f25a98c9320097079669e1acd10afd534e67ec2600000000100000fa2'], approximation: ['3e0d93ece16c10490435c08b7b755db9a57e53b818a9e62c0000000100000fa3', 'b632241f25a98c9320097079669e1acd10afd534e67ec2600000000100000fa2', '90a25e1efdafffab6369490140eecabb90ab0649108feeff0000000100000cd4', 'bf8735875031f53ccd50e48e6674d9ac64c90f68bb0c7edb0000000100000fa0', '5cdbfe828a4a84a4129bda3cc32bb8376914275561fa6a1a0000000100000da8', 'c5719c769542cb0cde49558784948082703f2da9618c29d80000000100000fb3', '0833e100c47da17ca6a2d202310483ed3c08f75ec2cfbf4a0000000100000c67', '1ff3d753fa4b857385f748c5d02a7371332241a8579211f9000000010000075c', '6f305111c4ab2bb2243ce34889afb4f72dff498303da56890000000100000c1e', '6d1b2eeef26e5ce9672e62a7ca43412c66b86ad0e48d27620000000100000fa0']
Traceback (most recent call last):
  File "/Users/baehyunsol/Documents/Rust/ragit/tests/ii.py", line 103, in ii_worker
    raise AssertionError(f"approximation[{i}] not in answer")
AssertionError: approximation[2] not in answer

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/Users/baehyunsol/Documents/Rust/ragit/tests/tests.py", line 630, in <module>
    test()
  File "/Users/baehyunsol/Documents/Rust/ragit/tests/ii.py", line 49, in ii
    ii_worker()
  File "/Users/baehyunsol/Documents/Rust/ragit/tests/ii.py", line 116, in ii_worker
    raise AssertionError(f"tfidf result on term '{term}' is not close enough. error: `{e}`, answer: {answer}, approximation: {approximation}")
AssertionError: tfidf result on term 'search gpg annot select correspond' is not close enough. error: `approximation[2] not in answer`, answer: ['3e0d93ece16c10490435c08b7b755db9a57e53b818a9e62c0000000100000fa3', 'c5719c769542cb0cde49558784948082703f2da9618c29d80000000100000fb3', '6d1b2eeef26e5ce9672e62a7ca43412c66b86ad0e48d27620000000100000fa0', '606389435f969a017ad1cf63a7a30eba0d1a08c743efea9f0000000100000318', 'f386d96798aad5baf548b6985b367932bdc89483b756b515000000010000081f', 'c66345d5ab119b4cf05a6899472b54a4fd0041ee2b83b9f80000000100000fa2', 'bf8735875031f53ccd50e48e6674d9ac64c90f68bb0c7edb0000000100000fa0', '509b4b369f9f9729365a6947ce43335209d934562feeb7220000000100000fa2', '82ad9747a31109a3ef965e4168a0968cb56a448390416e290000000100000bf5', 'b632241f25a98c9320097079669e1acd10afd534e67ec2600000000100000fa2'], approximation: ['3e0d93ece16c10490435c08b7b755db9a57e53b818a9e62c0000000100000fa3', 'b632241f25a98c9320097079669e1acd10afd534e67ec2600000000100000fa2', '90a25e1efdafffab6369490140eecabb90ab0649108feeff0000000100000cd4', 'bf8735875031f53ccd50e48e6674d9ac64c90f68bb0c7edb0000000100000fa0', '5cdbfe828a4a84a4129bda3cc32bb8376914275561fa6a1a0000000100000da8', 'c5719c769542cb0cde49558784948082703f2da9618c29d80000000100000fb3', '0833e100c47da17ca6a2d202310483ed3c08f75ec2cfbf4a0000000100000c67', '1ff3d753fa4b857385f748c5d02a7371332241a8579211f9000000010000075c', '6f305111c4ab2bb2243ce34889afb4f72dff498303da56890000000100000c1e', '6d1b2eeef26e5ce9672e62a7ca43412c66b86ad0e48d27620000000100000fa0']

history

cat_file

elapsed time: 37,604 ms

history

generous_file_reader

elapsed time: 32,073 ms

history

images

elapsed time: 10,952 ms

history

markdown_reader

elapsed time: 15,309 ms

history

csv_reader

elapsed time: 9,131 ms

history

real_repos

elapsed time: 311,819 ms

Error

Command '['cargo', 'run', '--release', '--', 'build']' returned non-zero exit status 1.
Traceback (most recent call last):
  File "/Users/baehyunsol/Documents/Rust/ragit/tests/tests.py", line 630, in <module>
    test()
  File "/Users/baehyunsol/Documents/Rust/ragit/tests/real_repos.py", line 63, in real_repos
    cargo_run(["build"])
  File "/Users/baehyunsol/Documents/Rust/ragit/tests/utils.py", line 70, in cargo_run
    result = subprocess.run(args, **kwargs)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/anaconda3/lib/python3.11/subprocess.py", line 571, in run
    raise CalledProcessError(retcode, process.args,
subprocess.CalledProcessError: Command '['cargo', 'run', '--release', '--', 'build']' returned non-zero exit status 1.

history

subdir

elapsed time: 28,669 ms

history

tfidf

elapsed time: 32,065 ms

history

merge

elapsed time: 52,132 ms

history

external_bases

elapsed time: 52,524 ms

history

end_to_end dummy

elapsed time: 81,439 ms

history

end_to_end llama3.3-70b

elapsed time: 99,604 ms

history

audit llama3.3-70b

elapsed time: 15,851 ms

history

logs llama3.3-70b

elapsed time: 7,773 ms

history

prompts dummy

elapsed time: 9,675 ms

history

prompts gpt-4o-mini

elapsed time: 49,523 ms

history

prompts claude-3.5-sonnet

elapsed time: 74,632 ms

history

empty dummy

elapsed time: 10,979 ms

history

empty llama3.3-70b

elapsed time: 11,634 ms

history

server_chat llama3.3-70b

elapsed time: 16,215 ms

Error

Expecting value: line 1 column 1 (char 0)
Traceback (most recent call last):
  File "/opt/anaconda3/lib/python3.11/site-packages/requests/models.py", line 974, in json
    return complexjson.loads(self.text, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/anaconda3/lib/python3.11/json/__init__.py", line 346, in loads
    return _default_decoder.decode(s)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/anaconda3/lib/python3.11/json/decoder.py", line 337, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/anaconda3/lib/python3.11/json/decoder.py", line 355, in raw_decode
    raise JSONDecodeError("Expecting value", s, err.value) from None
json.decoder.JSONDecodeError: Expecting value: line 1 column 1 (char 0)

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/Users/baehyunsol/Documents/Rust/ragit/tests/tests.py", line 630, in <module>
    test()
  File "/Users/baehyunsol/Documents/Rust/ragit/tests/tests.py", line 578, in <lambda>
    ("server_chat llama3.3-70b", lambda: server_chat(test_model="llama3.3-70b")),
                                         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/baehyunsol/Documents/Rust/ragit/tests/server_chat.py", line 67, in server_chat
    responses2.append(requests.post(f"http://127.0.0.1:41127/test-user/sample2/chat/{chat_id2}", files={"query": "How does the rust compiler implement type system?"}).json())
                      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/anaconda3/lib/python3.11/site-packages/requests/models.py", line 978, in json
    raise RequestsJSONDecodeError(e.msg, e.doc, e.pos)
requests.exceptions.JSONDecodeError: Expecting value: line 1 column 1 (char 0)

history

images2 gpt-4o-mini

elapsed time: 13,375 ms

history

images3 gpt-4o-mini

elapsed time: 12,642 ms

history

pdl gpt-4o-mini

elapsed time: 8,073 ms

history

pdf gpt-4o-mini

elapsed time: 145,263 ms

history

svg gpt-4o-mini

elapsed time: 18,883 ms

Error


Traceback (most recent call last):
  File "/Users/baehyunsol/Documents/Rust/ragit/tests/tests.py", line 630, in <module>
    test()
  File "/Users/baehyunsol/Documents/Rust/ragit/tests/tests.py", line 583, in <lambda>
    ("svg gpt-4o-mini", lambda: svg(test_model="gpt-4o-mini")),
                                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/baehyunsol/Documents/Rust/ragit/tests/svg.py", line 123, in svg
    assert "ragit" in cargo_run(["pdl", "test1.pdl"], stdout=True).lower()
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
AssertionError

history

web_images gpt-4o-mini

elapsed time: 40,598 ms

history

images2 claude-3.5-sonnet

elapsed time: 15,367 ms

history

extract_keywords dummy

elapsed time: 3,174 ms

history

extract_keywords gpt-4o-mini

elapsed time: 10,871 ms