Reranker¶
Reorder authors using a custom score that considers factors like:
- Semantic similarity to the query
- Citation count / impact
- Publication recency (newer works are prioritized)
This is useful for highlighting authors who are both relevant and currently active or impactful.
High-level procedure¶
- Calculate the score for each type of Resource
- Aggregate by Person
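The two steps above can be sketched in plain Python. This is only an illustration, not the library's code: the function name `aggregate_by_author`, the `(author_ids, score)` input shape, and the choice to average each author's top `n_per_author` resource scores are all assumptions made for the example.

```python
from collections import defaultdict


def aggregate_by_author(scored_resources, n_per_author=3):
    """Collapse per-resource scores into one score per author.

    scored_resources: iterable of (author_ids, score) pairs (hypothetical shape).
    Assumption: an author's score is the mean of their best n_per_author scores.
    """
    per_author = defaultdict(list)
    for author_ids, score in scored_resources:
        # A resource credits every one of its authors with the same score.
        for author_id in author_ids:
            per_author[author_id].append(score)

    result = {}
    for author_id, scores in per_author.items():
        top = sorted(scores, reverse=True)[:n_per_author]
        result[author_id] = sum(top) / len(top)
    return result


scored = [(["A", "B"], 3.0), (["A"], 1.0)]
print(aggregate_by_author(scored))
```

Co-authors of the same resource receive identical contributions, which is one plausible reason several authors can end up with the same final score.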
Define a custom formula for each resource¶
A general scoring formula for each resource can be expressed as:
$$ W_i = k_s \cdot s_i^{p_s} + k_c \cdot \log_{10}(c_i + 1) + \frac{k_r}{\log_{10}(y_c - y_i + m_r)} $$
where
$$
\begin{aligned}
W_i &\text{: computed score of a single resource} \\
s_i &\text{: semantic similarity between the resource and the query} \\
c_i &\text{: citation count of the resource} \\
y_i &\text{: publication year of the resource} \\
y_c &\text{: current year (used to measure recency)} \\
k_s, k_c, k_r &\text{: scaling factors for similarity, citations, and recency} \\
p_s &\text{: exponent that controls the weight of similarity} \\
m_r &\text{: margin that smooths the recency curve; it must be greater than 1 so that the logarithm stays positive when } y_i = y_c
\end{aligned}
$$
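As a worked example, the formula can be transcribed directly into Python. The function name `resource_score` and the default values (all scaling factors 1, exponent 3, margin 3) are chosen here for illustration only.

```python
import math


def resource_score(similarity, citations, pub_year, current_year,
                   k_s=1.0, k_c=1.0, k_r=1.0, p_s=3.0, m_r=3.0):
    """Direct transcription of W_i; defaults are illustrative assumptions."""
    age = current_year - pub_year
    if age + m_r <= 1:
        # log10 of a value <= 1 is zero or negative, which breaks the recency term.
        raise ValueError("current_year - pub_year + m_r must be greater than 1")
    relevance = k_s * similarity ** p_s
    impact = k_c * math.log10(citations + 1)
    recency = k_r / math.log10(age + m_r)
    return relevance + impact + recency


# similarity 1.0, 99 citations, published 7 years before the current year:
# 1.0**3 + log10(100) + 1/log10(10) = 1 + 2 + 1
print(resource_score(1.0, 99, 2018, 2025))  # 4.0
```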
We can define it in code as a string like this¶
Allowed parameters:
- Safe operators: log10, sqrt
- Timing: current_year
- Resource property names
formula = "distance**3 + log10(cited_by_count + 1) + 1/log10(current_year - publication_year + 3)"
This example:
- Rewards relevance → distance**3
- Rewards high impact → log10(cited_by_count + 1)
- Rewards recency → 1/log10(current_year - publication_year + 3)
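To make the idea of a restricted formula string concrete, here is a minimal sketch of how such a string could be evaluated against one resource. It is not the library's implementation: `evaluate_formula` and `ALLOWED_FUNCTIONS` are hypothetical names, and the name check is a simple guard rather than a full sandbox.

```python
import math

# Assumed whitelist, mirroring the "safe operators" listed above.
ALLOWED_FUNCTIONS = {"log10": math.log10, "sqrt": math.sqrt}


def evaluate_formula(formula, resource, current_year):
    """Evaluate a formula string using only whitelisted names.

    resource: dict of property name -> value (hypothetical input shape).
    """
    # Functions are applied last so a resource property cannot shadow them.
    names = {**resource, "current_year": current_year, **ALLOWED_FUNCTIONS}
    code = compile(formula, "<formula>", "eval")
    # co_names lists every global and attribute name the expression touches.
    unknown = set(code.co_names) - set(names)
    if unknown:
        raise NameError(f"names not allowed in formula: {sorted(unknown)}")
    # Empty __builtins__ keeps built-in functions out of reach.
    return eval(code, {"__builtins__": {}}, names)


formula = "distance**3 + log10(cited_by_count + 1) + 1/log10(current_year - publication_year + 3)"
work = {"distance": 1.0, "cited_by_count": 99, "publication_year": 2018}
print(evaluate_formula(formula, work, current_year=2025))  # 4.0
```

Rejecting unknown names before evaluation means a typo in a property name fails loudly instead of silently producing a wrong score.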
calculate_resource_score example¶
from bear.search import SearchEngine
from bear.reranker import ResourceScoringConfig, calculate_resource_score
works = SearchEngine().search_resource("work", "machine learning", top_k=5)
2025-08-01 13:31:59,442 - httpx - INFO - HTTP Request: GET http://olvi-1:8000/info "HTTP/1.1 200 OK"
2025-08-01 13:31:59,625 - httpx - INFO - HTTP Request: POST http://olvi-1:8000/embeddings "HTTP/1.1 200 OK"
config = ResourceScoringConfig(
    resource="work",
    formula="distance**3 + log10(cited_by_count + 1) + 1/log10(current_year - publication_year + 3)",
    min_distance=0.8,
    n_per_author=3,
)
calculate_resource_score(works, config)
# return: {author_id: score, ...}
{'https://openalex.org/A5007205551': 2.7499951510763396, 'https://openalex.org/A5011335346': 2.816261857033572, 'https://openalex.org/A5015902472': 2.7499951510763396, 'https://openalex.org/A5027246402': 2.843364030106179, 'https://openalex.org/A5028372112': 2.7499951510763396, 'https://openalex.org/A5051336681': 3.889355412230575, 'https://openalex.org/A5052159611': 1.6747544852790681, 'https://openalex.org/A5065160332': 2.816261857033572, 'https://openalex.org/A5079166112': 3.889355412230575, 'https://openalex.org/A5088826068': 2.816261857033572, 'https://openalex.org/A5100731437': 2.7499951510763396, 'https://openalex.org/A5101618713': 3.889355412230575, 'https://openalex.org/A5102902731': 2.816261857033572}
Reranker example¶
from bear.reranker import get_reranker
get_reranker("default").rerank({"work": works}) # system default reranker
[{'id': 'https://openalex.org/A5051336681', 'scores': {'total': 3.4879251880907427, 'work': 3.4879251880907427}}, {'id': 'https://openalex.org/A5079166112', 'scores': {'total': 3.4879251880907427, 'work': 3.4879251880907427}}, {'id': 'https://openalex.org/A5101618713', 'scores': {'total': 3.4879251880907427, 'work': 3.4879251880907427}}, {'id': 'https://openalex.org/A5027246402', 'scores': {'total': 2.5357178876045268, 'work': 2.5357178876045268}}, {'id': 'https://openalex.org/A5065160332', 'scores': {'total': 2.173139543205437, 'work': 2.173139543205437}}, {'id': 'https://openalex.org/A5011335346', 'scores': {'total': 2.173139543205437, 'work': 2.173139543205437}}, {'id': 'https://openalex.org/A5102902731', 'scores': {'total': 2.173139543205437, 'work': 2.173139543205437}}, {'id': 'https://openalex.org/A5088826068', 'scores': {'total': 2.173139543205437, 'work': 2.173139543205437}}, {'id': 'https://openalex.org/A5100731437', 'scores': {'total': 1.433595790381236, 'work': 1.433595790381236}}, {'id': 'https://openalex.org/A5007205551', 'scores': {'total': 1.433595790381236, 'work': 1.433595790381236}}, {'id': 'https://openalex.org/A5028372112', 'scores': {'total': 1.433595790381236, 'work': 1.433595790381236}}, {'id': 'https://openalex.org/A5015902472', 'scores': {'total': 1.433595790381236, 'work': 1.433595790381236}}, {'id': 'https://openalex.org/A5052159611', 'scores': {'total': 1.4064244663824488, 'work': 1.4064244663824488}}]
# Or, in a more complex system with multiple resources:
# get_reranker("default").rerank({"work": works, "grants": grants, "patents": patents})
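The reranker output shown above is a list of dictionaries with an `id` and a `scores` mapping, so downstream code can pick the top authors with ordinary Python. The helper `top_authors` below is a hypothetical convenience function, not part of the library; the sample entries reuse values from the output above.

```python
def top_authors(ranked, k=3, key="total"):
    """Return the ids of the k highest-scoring entries (hypothetical helper)."""
    ordered = sorted(ranked, key=lambda entry: entry["scores"][key], reverse=True)
    return [entry["id"] for entry in ordered[:k]]


ranked = [
    {"id": "https://openalex.org/A5052159611",
     "scores": {"total": 1.4064244663824488, "work": 1.4064244663824488}},
    {"id": "https://openalex.org/A5051336681",
     "scores": {"total": 3.4879251880907427, "work": 3.4879251880907427}},
    {"id": "https://openalex.org/A5027246402",
     "scores": {"total": 2.5357178876045268, "work": 2.5357178876045268}},
]
print(top_authors(ranked, k=2))
```

Passing `key="work"` would rank by a single resource type instead of the combined total.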