Listwise Learning to Rank (LTR) optimizes the entire ranking order for a given query, unlike pointwise or pairwise approaches, which score single documents or document pairs in isolation. It's especially effective for ranking lab tests by relevance to queries like:
"glucose in blood", "bilirubin in plasma", "white blood cells count"
Listwise LTR models learn a ranking function that optimizes evaluation metrics such as NDCG (Normalized Discounted Cumulative Gain).
- Input: A list of lab tests (documents) for a given query.
- Scoring Function: A model predicts a relevance score per test.
- Loss Function:
- eXtreme NDCG – a direct optimization of NDCG.
- LambdaRank – optimizes NDCG indirectly by weighting pairwise gradients with the NDCG change from swapping two documents.
- Output: A ranked list of lab tests based on predicted relevance.
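For intuition about the metric these losses target, here is a toy example (the relevance grades and predicted scores are invented) computing NDCG with scikit-learn:

```python
from sklearn.metrics import ndcg_score

# Hypothetical relevance labels for five lab tests returned for one query
# (higher = more relevant), and the scores a model predicted for them.
true_relevance = [[3, 2, 3, 0, 1]]
predicted_scores = [[0.9, 0.7, 0.8, 0.1, 0.3]]

# NDCG compares the predicted ordering against the ideal ordering; 1.0 is perfect.
print(ndcg_score(true_relevance, predicted_scores))
```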
We calculate relevance scores for lab tests by combining two different scoring procedures.
Traditional scoring is based on direct keyword matching between the query and dataset fields. This method prioritizes exact and partial string matches in key attributes such as the component and system.
- Component: Substance measured (e.g., Glucose)
- System: Environment of measurement (e.g., Blood, Serum/Plasma)
Each lab test includes:
- Component
- System

Two match levels are scored for each field:
- Exact Match: full match with the query term.
- Partial Match: synonyms or semantically similar terms.
- Exact Match (Component) = `weight(component) * weight(component)`
- Partial Match (Component) = `(weight(component) / 2) * weight(component)`
- Exact Match (System) = `weight(system) * weight(system)`
- Partial Match (System) = `(weight(system) / 2) * weight(system)`
- No Match = `0`
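A minimal sketch of this weighted matching scheme; the weight values and the `is_partial_match` helper are illustrative assumptions, not the project's actual code:

```python
# Illustrative field weights (assumed values, not from the project).
WEIGHTS = {"component": 3.0, "system": 2.0}

def is_partial_match(query_term: str, field_value: str) -> bool:
    """Placeholder partial-match test: substring overlap stands in
    for synonym / semantic matching."""
    return query_term in field_value or field_value in query_term

def match_score(query_term: str, field_value: str, field: str) -> float:
    w = WEIGHTS[field]
    query_term, field_value = query_term.lower(), field_value.lower()
    if query_term == field_value:                   # Exact Match = weight * weight
        return w * w
    if is_partial_match(query_term, field_value):   # Partial Match = weight/2 * weight
        return (w / 2) * w
    return 0.0                                      # No Match = 0

def traditional_score(query_terms: dict, lab_test: dict) -> float:
    """Sum match scores over the component and system fields."""
    return sum(match_score(query_terms[f], lab_test[f], f) for f in WEIGHTS)

# Example: query "glucose in blood" mapped to its component/system terms.
print(traditional_score({"component": "glucose", "system": "blood"},
                        {"component": "Glucose", "system": "Blood"}))
```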
The second, embedding-based method uses sentence embeddings to measure the semantic similarity between the query and each field in the dataset.
- Encode the query string into a vector using a pre-trained embedding model.
- Encode each text field (e.g., component, system) into a vector representation.
- Use cosine similarity to compare the query vector and each field's embedding:
  `similarity = cosine_similarity([query_embedding], [cell_embedding])[0][0]`
- Normalize the similarity score from [-1, 1] to [0, 1]:
  `normalized_score = (similarity + 1) / 2`
- Compute the final embedding score for a field:
  `embedding_score = normalized_score * 5 * weight(field)`
- Aggregate across all eligible text fields (see the sketch below).
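A sketch of these steps, assuming the sentence-transformers library and the `all-MiniLM-L6-v2` model (the specific embedding model and field weights are assumptions, not stated above):

```python
from sentence_transformers import SentenceTransformer
from sklearn.metrics.pairwise import cosine_similarity

model = SentenceTransformer("all-MiniLM-L6-v2")  # assumed pre-trained model

def embedding_score(query: str, cell_text: str, field_weight: float) -> float:
    # Encode the query and the field text into vectors.
    query_embedding = model.encode([query])[0]
    cell_embedding = model.encode([cell_text])[0]

    # Cosine similarity between the two vectors, in [-1, 1].
    similarity = cosine_similarity([query_embedding], [cell_embedding])[0][0]

    # Rescale to [0, 1], then apply the 5x factor and the field weight.
    normalized_score = (similarity + 1) / 2
    return normalized_score * 5 * field_weight

# Aggregate over the eligible text fields of one lab test (weights assumed).
fields = {"component": ("Glucose", 3.0), "system": ("Blood", 2.0)}
total = sum(embedding_score("glucose in blood", text, w)
            for text, w in fields.values())
print(total)
```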
The final relevance score combines both procedures:
- `total_score = traditional_score + embedding_score`

Normalize scores between 0 and 1 using:
- `Normalized_Score = total_score / max_score`
Save the processed data and scores into a new CSV file for model training.
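A sketch of the combination, normalization, and export step using pandas (the file and column names here are assumptions based on the description above):

```python
import pandas as pd

df = pd.read_csv("lab_tests_scored.csv")  # assumed input with per-method scores

# Combine the two scoring procedures and normalize to [0, 1].
df["Total_Score"] = df["Traditional_Score"] + df["Embedding_Score"]
df["Normalized_Score"] = df["Total_Score"] / df["Total_Score"].max()

# Save the processed data and scores for model training.
df.to_csv("lab_tests_for_training.csv", index=False)
```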
We use LightGBM due to its speed, simplicity, and support for listwise ranking.
- Load data from CSV.
- Encode categorical columns: `Query`, `Name`, `Component`, `System`, `Property`, `Measurement`.
- Create `Score_label` from `Normalized_Score`.
- Split into train and test sets.
- Features: encoded columns.
- Grouping: group by `Query` (listwise requirement).
- Labels: use `Score_label` (see the sketch below).
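A sketch of these preparation steps; the file name and the use of `LabelEncoder` are assumptions, and the integer binning of `Score_label` is our reading of "create `Score_label` from `Normalized_Score`" (LightGBM's ranking objectives expect integer relevance grades):

```python
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import LabelEncoder

df = pd.read_csv("lab_tests_for_training.csv")  # assumed file name

# Encode the categorical feature columns as integers.
feature_cols = ["Query", "Name", "Component", "System", "Property", "Measurement"]
for col in feature_cols:
    df[col] = LabelEncoder().fit_transform(df[col].astype(str))

# Assumed: bin Normalized_Score into integer relevance grades 0..10.
df["Score_label"] = (df["Normalized_Score"] * 10).round().astype(int)

# Split into train and test sets.
train_df, test_df = train_test_split(df, test_size=0.2, random_state=42)

# Listwise training needs group sizes: the number of rows per query, in the
# same order as the rows, so keep each query's rows contiguous.
train_df = train_df.sort_values("Query")
train_groups = train_df.groupby("Query", sort=False).size().to_list()
```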
- Objective: `rank_xendcg`
- Approach: simulate AdaRank-style boosting and reweighting using LightGBM parameters.
- Predict and normalize scores.
- Sort by `Query` and `Predicted Score`.
- Save results to `results.csv`.
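A sketch of the training and prediction step with LightGBM's ranker API, continuing the assumptions from the preparation sketch above (hyperparameters are illustrative):

```python
import lightgbm as lgb

# Listwise ranker with the xendcg objective (direct NDCG optimization).
ranker = lgb.LGBMRanker(
    objective="rank_xendcg",
    n_estimators=200,
    learning_rate=0.1,
)
ranker.fit(
    train_df[feature_cols],
    train_df["Score_label"],
    group=train_groups,  # rows per query, matching the sorted training frame
)

# Predict, normalize by the maximum predicted score, and sort within each query.
test_df = test_df.copy()
test_df["Predicted Score"] = ranker.predict(test_df[feature_cols])
test_df["Predicted Score"] /= test_df["Predicted Score"].max()
test_df = test_df.sort_values(["Query", "Predicted Score"], ascending=[True, False])
test_df.to_csv("results.csv", index=False)
```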
To improve NDCG, we introduced new features, expanded queries, and added more data.
Added queries beyond the original three:
- calcium in serum
- cells in urine
- query variations such as `calcium`, `urine`, and `cells`
We queried LOINC Search for additional documents:
- bilirubin in plasma / bilirubin
- calcium in serum / calcium
- glucose in blood / glucose
- leukocytes / white blood cells count
- blood / urine / cells
Saved results as CSVs.
We use multiple metrics to assess model performance:
| Metric | Description | Ideal Value |
|---|---|---|
| MSE | Mean Squared Error – lower is better | 0 |
| R² | R-squared – explained variance, higher is better | 1 |
| Spearman's ρ | Rank correlation – higher indicates a stronger ranking match | 1 |
| NDCG | Normalized DCG – higher means better ranking quality | 1 |
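A sketch of how these metrics can be computed from the predictions (the `evaluate` helper is ours; it assumes a results frame with the columns used above, and averages NDCG over per-query groups):

```python
import numpy as np
from scipy.stats import spearmanr
from sklearn.metrics import mean_squared_error, ndcg_score, r2_score

def evaluate(df):
    """df is assumed to hold Score_label and 'Predicted Score' columns."""
    y_true = df["Score_label"].to_numpy(dtype=float)
    y_pred = df["Predicted Score"].to_numpy()

    mse = mean_squared_error(y_true, y_pred)
    r2 = r2_score(y_true, y_pred)
    rho, _ = spearmanr(y_true, y_pred)

    # NDCG per query group, then averaged (groups of one are skipped,
    # since a single document has no ranking to score).
    ndcgs = [ndcg_score([g["Score_label"]], [g["Predicted Score"]])
             for _, g in df.groupby("Query") if len(g) > 1]
    return mse, r2, rho, float(np.mean(ndcgs))
```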
| Dataset | MSE | R² | Spearman ρ | NDCG | Notes |
|---|---|---|---|---|---|
| Basic | 0.1642 | -2.5187 | 0.7265 | 0.9086 | Initial 3 queries |
| First Enhanced | 0.0479 | -1.9010 | 0.4700 | 0.8533 | Added `calcium in serum` |
| Second Enhanced | 0.0461 | -0.8984 | 0.6024 | 0.9421 | Added `bilirubin`, `glucose`, `leukocytes` |
| Third Enhanced | 0.0252 | -0.4765 | 0.4983 | 0.9398 | Added `blood`, `serum or plasma` |
| Fourth Enhanced | 0.0450 | -1.4383 | 0.4323 | 0.9448 | Added `cells in urine` |
| Fifth Enhanced | 0.0191 | -0.6009 | 0.4615 | 0.9517 | Final version with `cells`, `urine` |

Per-query NDCG for the final model:
- bilirubin in plasma: 0.9499
- calcium in serum: 0.9637
- cells in urine: 0.9448
- glucose in blood: 0.9663
- white blood cells count: 0.9339