simet.services.precision_recall
PrecisionRecallService
Utilities for feature normalization and batched k-NN search (FAISS).
batched_search (staticmethod)
batched_search(index, queries, k, batch_size, return_indices=False)
Run a k-NN search against a FAISS index with optional batching.
Wraps index.search(queries, k) and allows chunking the query set to
control memory usage. Returns distances by default, or indices
if return_indices=True.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| index | Index | A FAISS index (e.g., …). | required |
| queries | ndarray | 2D array of query vectors, shape … | required |
| k | int | Number of nearest neighbors to retrieve per query. | required |
| batch_size | int \| None | Batch size used to chunk the query set. If … | required |
| return_indices | bool | If True, return indices instead of distances. | False |
Returns:
| Type | Description |
|---|---|
| ndarray | Distances by default, or indices if return_indices=True. |
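The chunking behavior described above can be sketched as follows. This is a minimal illustration, not the code in simet/services/precision_recall.py: `BruteForceIndex` is a hypothetical NumPy stand-in that mimics the `search(queries, k)` interface of a FAISS index (returning distances and indices), `batched_search_sketch` is an illustrative name, and treating `batch_size=None` as a single un-chunked search is an assumption.

```python
import numpy as np


class BruteForceIndex:
    """Hypothetical stand-in exposing a FAISS-like search(queries, k) method."""

    def __init__(self, database: np.ndarray) -> None:
        self.database = database

    def search(self, queries: np.ndarray, k: int):
        # Squared L2 distance between every query and every database row.
        diff = queries[:, None, :] - self.database[None, :, :]
        dists = np.einsum("qnd,qnd->qn", diff, diff)
        # Indices of the k smallest distances per query, nearest first.
        idx = np.argsort(dists, axis=1)[:, :k]
        return np.take_along_axis(dists, idx, axis=1), idx


def batched_search_sketch(index, queries, k, batch_size, return_indices=False):
    """Illustrative re-creation of the documented behavior, not the library code."""
    if batch_size is None:
        # Assumption: None means one search over the whole query set.
        distances, indices = index.search(queries, k)
        return indices if return_indices else distances

    chunks = []
    for start in range(0, len(queries), batch_size):
        # Search one slice of the queries at a time to bound memory usage.
        distances, indices = index.search(queries[start:start + batch_size], k)
        chunks.append(indices if return_indices else distances)
    return np.concatenate(chunks, axis=0)


rng = np.random.default_rng(0)
database = rng.standard_normal((50, 8)).astype(np.float32)
queries = rng.standard_normal((10, 8)).astype(np.float32)
index = BruteForceIndex(database)

full = batched_search_sketch(index, queries, k=3, batch_size=None)
chunked = batched_search_sketch(index, queries, k=3, batch_size=4)
neighbor_ids = batched_search_sketch(
    index, queries, k=3, batch_size=4, return_indices=True
)
```

Because each query is searched independently, the chunked result should match the un-chunked one row for row; only peak memory differs.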
Notes
- No normalization is applied here. If needed, call safe_norm on your database and query vectors beforehand (and use a cosine-compatible index).
- For FAISS IVF/HNSW indexes, performance and accuracy also depend on probe settings (nprobe, etc.), which you should configure on index prior to calling this function.
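To see why normalizing both sides makes a search "cosine-compatible", the following NumPy-only sketch checks two identities on random data: for unit-length rows, the inner product equals cosine similarity, and the squared L2 distance equals 2 - 2·cos. The helper `unit_rows` is a hypothetical name used only here; it returns a copy, whereas the documented safe_norm works in place.

```python
import numpy as np

rng = np.random.default_rng(1)
database = rng.standard_normal((20, 4)).astype(np.float32)
queries = rng.standard_normal((5, 4)).astype(np.float32)


def unit_rows(x: np.ndarray) -> np.ndarray:
    """Return a row-normalized copy (hypothetical helper for this example)."""
    norms = np.linalg.norm(x, axis=1, keepdims=True)
    return x / np.maximum(norms, 1e-12)


db_unit = unit_rows(database)
q_unit = unit_rows(queries)

# Inner product of unit vectors, as an inner-product index would score them.
inner = q_unit @ db_unit.T

# Reference cosine similarity computed from the raw (unnormalized) vectors.
cosine = (queries @ database.T) / (
    np.linalg.norm(queries, axis=1, keepdims=True)
    * np.linalg.norm(database, axis=1)
)

# Squared L2 distance between unit vectors, as an L2 index would score them.
sq_l2 = ((q_unit[:, None, :] - db_unit[None, :, :]) ** 2).sum(axis=2)
```

Under these identities, an inner-product index (for example FAISS's IndexFlatIP) over normalized vectors ranks neighbors by cosine similarity, and an L2 index over the same normalized vectors produces the same ranking in reverse score order.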
Source code in simet/services/precision_recall.py
safe_norm (staticmethod)
safe_norm(x)
In-place L2 normalization of row vectors with zero-safe clipping.
Normalizes each row of x to unit L2 norm: x[i] /= max(||x[i]||_2, 1e-12).
Operates in place and returns None.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| x | ndarray | 2D array of shape … | required |
Notes
- Uses an epsilon 1e-12 to avoid division by zero for all-zero rows.
- The dtype is preserved; if you need float32, cast before calling.
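The formula x[i] /= max(||x[i]||_2, 1e-12) can be sketched in NumPy as below. This is an illustration of the documented behavior under the assumption of a floating-point input array, not the library's source; `safe_norm_sketch` is a hypothetical name.

```python
import numpy as np


def safe_norm_sketch(x: np.ndarray) -> None:
    """Normalize each row of a float 2D array to unit L2 norm, in place."""
    # Per-row L2 norms, kept as a column so they broadcast across each row.
    norms = np.linalg.norm(x, axis=1, keepdims=True)
    # Clip tiny norms so all-zero rows divide by 1e-12 instead of 0.
    np.maximum(norms, 1e-12, out=norms)
    # In-place division: the caller's array is modified and keeps its dtype.
    x /= norms


x = np.array([[3.0, 4.0], [0.0, 0.0]], dtype=np.float32)
result = safe_norm_sketch(x)
# x is now approximately [[0.6, 0.8], [0.0, 0.0]]; result is None.
```

The all-zero row stays all zeros rather than becoming NaN, which is the point of the epsilon clip.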
Source code in simet/services/precision_recall.py