Signal scoring algorithm
Function: computeSignalScore() in lib/scoring/engine.ts
Each signal is processed in a fixed order. The order matters — later steps see the results of earlier ones.
1. Check zero-point conditions
→ if triggered: score = 0, skip all multipliers, stop here
2. Check penalties
→ spam: add spamPenalty (-12)
→ pr_close_no_merge: add prClosedNoMergePenalty (-10)
3. Bot activity check (zero-point)
→ author.type === 'Bot' OR login.endsWith('[bot]') → score = 0
4. [contributor only] Daily quota
→ if daily count for signal type > quota limit → score = 0
5. [contributor only] Diminishing returns
→ if weekly count > weeklyThreshold → apply decay formula
6. Apply multipliers (in order):
a. first_activity (1.5×)
b. merged_pr_commit (1.2×)
c. pr_linked_to_issue (1.1×)
7. For review signals: multiply by reviewStateWeights[review.state]
8. Final score = (base × multiplier_product × reviewStateWeight) − penalties

Steps 4 and 5 only apply in contributor mode. Repository and team scoring use skipQuota: true.
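The ordering above can be sketched as one function. This is a minimal illustration, not the actual computeSignalScore(): the SketchSignal shape, its precomputed flags, the PENALTY_MAGNITUDE table, and the sketchScore name are all assumptions made for the example. It also assumes that the decay factor from step 5 scales the base points and that penalties are stored as magnitudes subtracted in step 8.

```typescript
// Hypothetical shapes for illustration; the real types in lib/scoring/engine.ts may differ.
type SketchSignal = {
  type: string;
  base: number;              // base points for the signal type
  zeroPoint: boolean;        // an enabled zero-point condition matched (step 1)
  isBot: boolean;            // bot activity flag (step 3)
  overDailyQuota: boolean;   // daily count exceeds the quota (step 4)
  decayFactor: number;       // diminishing-returns factor (step 5); 1 when no decay applies
  multipliers: number[];     // factors from step 6, already in a, b, c order
  reviewStateWeight: number; // step 7; 1 for non-review signals
};

// Assumption: penalties kept as magnitudes and subtracted at the end.
const PENALTY_MAGNITUDE: Record<string, number> = {
  spam: 12,
  pr_close_no_merge: 10,
};

function sketchScore(signal: SketchSignal, opts: { skipQuota: boolean }): number {
  // Step 1: a zero-point condition stops processing immediately.
  if (signal.zeroPoint) return 0;
  // Step 2: look up the penalty; it is only subtracted in step 8.
  const penalty = PENALTY_MAGNITUDE[signal.type] ?? 0;
  // Step 3: bot activity scores zero.
  if (signal.isBot) return 0;
  let points = signal.base;
  // Steps 4 and 5 run in contributor mode only (skipQuota is false).
  if (!opts.skipQuota) {
    if (signal.overDailyQuota) return 0;
    points *= signal.decayFactor;
  }
  // Step 6: multipliers stack multiplicatively.
  const product = signal.multipliers.reduce((acc, factor) => acc * factor, 1);
  // Steps 7 and 8: review weight, then penalties.
  return points * product * signal.reviewStateWeight - penalty;
}
```

Under these assumptions, an approved review that is also the user's first review scores 20 × 1.5 × 1.25 = 37.5, and a spam signal (base 0) ends at −12.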
Base points
| Signal type | Base points |
|---|---|
| commit | 10 |
| pr_merge | 50 |
| review | 20 |
| review_comment | 5 |
| issue_open | 10 |
| issue_close | 10 |
| comment | 3 |
| pr_open | 0 (tracking only) |
| pr_close_no_merge | 0 (penalty only) |
| spam | 0 (penalty only) |
Review state weights
Applied as a multiplier to review signals only.
| State | Weight |
|---|---|
| approved | 1.25× |
| changes_requested | 1.0× |
| commented | 0.5× |
A review comment (review_comment) is a separate signal type and does not use these weights.
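As a small worked example of the weights above, the sketch below applies them to the review base points (20). The reviewStateWeights name comes from the algorithm steps; the ReviewState type, the REVIEW_BASE_POINTS constant, and the weightedReviewPoints helper are hypothetical names used only for illustration.

```typescript
type ReviewState = "approved" | "changes_requested" | "commented";

// Weights from the table above.
const reviewStateWeights: Record<ReviewState, number> = {
  approved: 1.25,
  changes_requested: 1.0,
  commented: 0.5,
};

// Base points for the review signal type.
const REVIEW_BASE_POINTS = 20;

// Hypothetical helper: base points scaled by the state weight, before other multipliers or penalties.
function weightedReviewPoints(state: ReviewState): number {
  return REVIEW_BASE_POINTS * reviewStateWeights[state];
}

console.log(weightedReviewPoints("approved"));          // 25
console.log(weightedReviewPoints("changes_requested")); // 20
console.log(weightedReviewPoints("commented"));         // 10
```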
Multipliers
All multipliers stack multiplicatively. They are applied in the order listed.
| Kind | Factor | Applies to |
|---|---|---|
| first_activity | 1.5× | commit, pr_merge, review, issue_open, issue_close, comment (once per type per scoring run) |
| merged_pr_commit | 1.2× | commit signals where metadata.isInMergedPR = true |
| pr_linked_to_issue | 1.1× | pr_merge, commit signals where metadata.hasLinkedIssue = true |
first_activity only applies to the first signal of each type within a single scoring run. Repository scoring skips the first_activity multiplier entirely.
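The stacking rule and the once-per-type behaviour of first_activity can be sketched as follows. The names MULTIPLIER_FACTORS, APPLICATION_ORDER, multiplierProduct, and isFirstActivity are hypothetical, and the seenTypes set stands in for whatever state the real scoring context keeps during a run.

```typescript
type MultiplierKind = "first_activity" | "merged_pr_commit" | "pr_linked_to_issue";

const MULTIPLIER_FACTORS: Record<MultiplierKind, number> = {
  first_activity: 1.5,
  merged_pr_commit: 1.2,
  pr_linked_to_issue: 1.1,
};

// Order in which the multipliers are listed in the table.
const APPLICATION_ORDER: MultiplierKind[] = [
  "first_activity",
  "merged_pr_commit",
  "pr_linked_to_issue",
];

// Multiplies the factors of every active multiplier, in table order.
function multiplierProduct(active: ReadonlySet<MultiplierKind>): number {
  return APPLICATION_ORDER
    .filter((kind) => active.has(kind))
    .reduce((acc, kind) => acc * MULTIPLIER_FACTORS[kind], 1);
}

const FIRST_ACTIVITY_TYPES = new Set([
  "commit", "pr_merge", "review", "issue_open", "issue_close", "comment",
]);

// Returns true only for the first eligible signal of each type seen in this run.
function isFirstActivity(seenTypes: Set<string>, signalType: string): boolean {
  if (!FIRST_ACTIVITY_TYPES.has(signalType) || seenTypes.has(signalType)) return false;
  seenTypes.add(signalType);
  return true;
}
```

With all three multipliers active the product is 1.5 × 1.2 × 1.1 = 1.98, so a 10-point commit would score roughly 19.8 before penalties. Because the factors are floating-point numbers, the computed value can differ from 1.98 in the last digits.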
Penalties
| Kind | Amount |
|---|---|
| spamPenalty | −12 |
| prClosedNoMergePenalty | −10 |
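Penalties are applied after the multiplier product is calculated, so multipliers never scale a penalty. The amounts are signed: each penalty lowers the final score by its magnitude.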
Zero-point conditions
Each condition is independently toggleable in a preset. All are enabled by default.
| Kind | Condition |
|---|---|
| self_review | User reviews their own PR |
| self_merge | User merges their own PR |
| bot_activity | author.type === 'Bot' OR login.endsWith('[bot]') |
| issue_closed_no_pr | Issue closed with no linked PR (parsed from Closes/Fixes/Resolves keywords in PR bodies) |
| pr_closed_no_merge | PR closed without merging; also triggers prClosedNoMergePenalty |
When a zero-point condition triggers, score = 0 and multipliers are not applied.
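One way to picture the per-preset toggles is a map from condition kind to a boolean. The sketch below is an assumption about shape only: ZeroPointToggles, ZeroPointFacts, DEFAULT_TOGGLES, and triggeredZeroPoint are hypothetical names, and the real preset format is not shown in this document.

```typescript
type ZeroPointKind =
  | "self_review"
  | "self_merge"
  | "bot_activity"
  | "issue_closed_no_pr"
  | "pr_closed_no_merge";

type ZeroPointToggles = Record<ZeroPointKind, boolean>;

// All conditions are enabled by default.
const DEFAULT_TOGGLES: ZeroPointToggles = {
  self_review: true,
  self_merge: true,
  bot_activity: true,
  issue_closed_no_pr: true,
  pr_closed_no_merge: true,
};

// Which conditions a given signal matches, as determined elsewhere (hypothetical input).
type ZeroPointFacts = Partial<Record<ZeroPointKind, boolean>>;

// Returns the first enabled condition the signal matches, or null if none triggers.
function triggeredZeroPoint(
  facts: ZeroPointFacts,
  toggles: ZeroPointToggles = DEFAULT_TOGGLES,
): ZeroPointKind | null {
  for (const kind of Object.keys(toggles) as ZeroPointKind[]) {
    if (toggles[kind] && facts[kind] === true) return kind;
  }
  return null;
}
```

A preset that disables self_merge would pass `{ ...DEFAULT_TOGGLES, self_merge: false }`, in which case a self-merged PR is scored normally.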
Daily quotas (contributor mode only)
| Signal type | Daily limit |
|---|---|
| commit | 4 |
| comment | 4 |
Signals beyond the daily limit score 0 points. Quotas are tracked per-user per-day within a scoring run.
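A per-user, per-day counter is enough to implement the rule above. The following sketch is illustrative: DailyQuotaTracker and its record method are hypothetical, and bucketing days by UTC date is an assumption, since the document does not say which time zone defines a day.

```typescript
// Daily limits from the table above; types without an entry are unlimited.
const DAILY_LIMITS: ReadonlyMap<string, number> = new Map([
  ["commit", 4],
  ["comment", 4],
]);

class DailyQuotaTracker {
  // Key: userId|signalType|YYYY-MM-DD, value: signals counted so far.
  private readonly counts = new Map<string, number>();

  // Counts the signal and returns true while the user is still within the daily limit.
  record(userId: string, signalType: string, timestamp: Date): boolean {
    const limit = DAILY_LIMITS.get(signalType);
    if (limit === undefined) return true;
    const day = timestamp.toISOString().slice(0, 10); // UTC calendar day (assumption)
    const key = `${userId}|${signalType}|${day}`;
    const count = (this.counts.get(key) ?? 0) + 1;
    this.counts.set(key, count);
    return count <= limit;
  }
}
```

With a fresh tracker per scoring run, the fifth commit by the same user on the same day returns false and would score 0 points, while the first commit on the following day returns true again.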
Diminishing returns (contributor mode only)
Applied when a user exceeds weeklyThreshold signals of the same type within the current week.
| Parameter | Default value |
|---|---|
| weeklyThreshold | 9 |
| decayFactor | 0.11 (11% per excess signal) |
| floorFraction | 0.2 (minimum 20% of base points) |
Formula:

    if weeklyCount > weeklyThreshold:
        excess = weeklyCount - weeklyThreshold
        factor = max(floorFraction, 1 - (excess × decayFactor))
        points = base × factor

Example: a user has 13 commits in the current week (weeklyThreshold = 9, decayFactor = 0.11).
- excess = 13 − 9 = 4
- factor = max(0.2, 1 − (4 × 0.11)) = max(0.2, 0.56) = 0.56
- A commit worth 10 base points scores 10 × 0.56 = 5.6 points.

At excess = 8 (17 commits/week): factor = max(0.2, 1 − 0.88) = max(0.2, 0.12) = 0.2, so the floor is reached.
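The decay formula translates directly into a small function. This is a sketch using the default parameters; the DecayConfig interface and the decayFactorFor name are hypothetical, and whether weeklyCount includes the signal being scored is an assumption.

```typescript
interface DecayConfig {
  weeklyThreshold: number;
  decayFactor: number;
  floorFraction: number;
}

// Default values from the parameter table.
const DEFAULT_DECAY: DecayConfig = {
  weeklyThreshold: 9,
  decayFactor: 0.11,
  floorFraction: 0.2,
};

// Returns the factor by which base points are scaled for the given weekly count.
function decayFactorFor(weeklyCount: number, cfg: DecayConfig = DEFAULT_DECAY): number {
  // At or below the threshold, no decay applies.
  if (weeklyCount <= cfg.weeklyThreshold) return 1;
  const excess = weeklyCount - cfg.weeklyThreshold;
  // Linear decay per excess signal, clamped at the floor.
  return Math.max(cfg.floorFraction, 1 - excess * cfg.decayFactor);
}
```

For the worked example, `decayFactorFor(13)` is approximately 0.56 (subject to floating-point rounding) and `decayFactorFor(17)` returns the floor, 0.2.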
Entity aggregation modes
Contributor (computeScores() in lib/scoring/engine.ts)
- Processes signals chronologically per user.
- Full scoring context: daily quotas, diminishing returns, all multipliers.
- Output: entries[] sorted by score descending with sequential rank.
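The output step of contributor scoring (sort by score, then assign ranks) might look like the sketch below. The UserScore and RankedEntry shapes and the rankEntries name are hypothetical; the real entries[] likely carries more fields. Giving tied scores distinct consecutive ranks in input order is also an assumption.

```typescript
type UserScore = { userId: string; score: number };
type RankedEntry = UserScore & { rank: number };

// Sorts a copy of the input by score descending and assigns ranks 1, 2, 3, ...
function rankEntries(scores: readonly UserScore[]): RankedEntry[] {
  return [...scores]
    .sort((a, b) => b.score - a.score)
    .map((entry, index) => ({ ...entry, rank: index + 1 }));
}
```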
Repository (aggregateByRepository() in lib/scoring/aggregate.ts)
- Groups signals by repository.
- Uses skipQuota: true, so no daily quotas and no diminishing returns apply.
- Zero-point conditions and penalties still apply.
- Fresh scoring context per repository.
- The first_activity multiplier is filtered out.
Team (aggregateByTeamSignals() in lib/scoring/aggregate.ts)
- Runs computeScores() first to get per-user scores.
- Maps users to teams via teamMemberships (fetched by fetchOrgTeamsDataGraphQL).
- Sums user scores per team.
- A user in multiple teams contributes their full score to each team.
- Members are deduplicated within a single team: a user counts once per team regardless of how many team_members rows exist.
- Returns entries with teamSlug, memberCount, and breakdown.
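The team steps above amount to a grouped sum with per-team deduplication. The sketch below is not the actual aggregateByTeamSignals(): the TeamMembershipRow and TeamEntry shapes and the sumScoresByTeam name are hypothetical, and the layout of breakdown (user ID to score) is an assumption, since the document names the field without describing it.

```typescript
type TeamMembershipRow = { teamSlug: string; userId: string };

type TeamEntry = {
  teamSlug: string;
  memberCount: number;
  score: number;
  breakdown: Record<string, number>; // assumed shape: userId -> score
};

function sumScoresByTeam(
  userScores: ReadonlyMap<string, number>,
  rows: readonly TeamMembershipRow[],
): TeamEntry[] {
  // Collect members per team; the Set drops duplicate membership rows.
  const members = new Map<string, Set<string>>();
  for (const row of rows) {
    let userIds = members.get(row.teamSlug);
    if (!userIds) {
      userIds = new Set<string>();
      members.set(row.teamSlug, userIds);
    }
    userIds.add(row.userId);
  }

  const entries: TeamEntry[] = [];
  members.forEach((userIds, teamSlug) => {
    const breakdown: Record<string, number> = {};
    let score = 0;
    userIds.forEach((userId) => {
      // Users with no scored signals contribute 0.
      const userScore = userScores.get(userId) ?? 0;
      breakdown[userId] = userScore;
      score += userScore;
    });
    entries.push({ teamSlug, memberCount: userIds.size, score, breakdown });
  });
  return entries.sort((a, b) => b.score - a.score);
}
```

A user listed in two teams appears in both entries with their full score, while a duplicated membership row within one team changes neither memberCount nor the team total.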
Signal metadata
Each signal stores JSONB metadata (max 2048 bytes, enforced by DB CHECK constraint).
| Signal type | Metadata fields |
|---|---|
| commit | sha, isInMergedPR, message, isBot |
| pr_open, pr_merge, pr_close_no_merge | prNumber, title, body, linkedIssueIds, isSelfMerge, hasLinkedIssue, isBot |
| issue_open | issueNumber, title, body, isBot |
| issue_close | issueNumber, title, linkedPRNumber, hasLinkedPR, isBot |
| review | prNumber, state, isSelfReview, body, isBot |
| review_comment | commentId, prNumber, body, isBot |
| comment | commentId, issueNumber, body, isBot |
Bot detection
A signal is flagged as bot activity when either of these is true:
- author.type === 'Bot'
- login.endsWith('[bot]')
The isBot flag is set during normalization (normalizeGitHubData() in lib/scoring/normalize.ts) and stored in both the signal metadata JSONB and the users.is_bot column.
When the bot_activity zero-point condition is enabled (default: true), any signal with isBot = true scores 0 points.
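The two-part bot check can be expressed as a single predicate. The GitHubAuthorLike type and the isBotAuthor name are hypothetical; only the two conditions themselves come from this document.

```typescript
// Minimal author shape needed for the check (hypothetical type).
type GitHubAuthorLike = { type?: string; login: string };

// True when either condition from the list above holds.
function isBotAuthor(author: GitHubAuthorLike): boolean {
  return author.type === "Bot" || author.login.endsWith("[bot]");
}

console.log(isBotAuthor({ type: "User", login: "dependabot[bot]" })); // true
console.log(isBotAuthor({ type: "Bot", login: "renovate" }));         // true
console.log(isBotAuthor({ type: "User", login: "octocat" }));         // false
```

Checking the login suffix as well as the type covers accounts whose type field is missing or reported as a regular user.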
Content hash and deduplication
normalizeGitHubData() produces a SHA-256 hash (first 32 characters) from:

    SHA-256(event_type + user + repo + timestamp + identifier)

The signals table has a unique index on (user_id, type, repository_id, event_timestamp, content_hash). upsertSignals() is fully idempotent: re-ingesting the same GitHub activity produces no duplicate rows.
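The hash construction described above could be sketched with Node's built-in crypto module. The HashParts shape and the contentHash name are hypothetical. The sketch assumes plain string concatenation with no separator (mirroring the formula as written), a hex-encoded digest, and truncation to the first 32 hex characters; the real normalizeGitHubData() may differ on any of these points.

```typescript
import { createHash } from "node:crypto";

// Hypothetical input shape mirroring the five fields in the formula.
type HashParts = {
  eventType: string;
  user: string;
  repo: string;
  timestamp: string; // for example an ISO 8601 string
  identifier: string;
};

// Concatenates the fields, hashes them with SHA-256, and keeps the first 32 hex characters.
function contentHash(parts: HashParts): string {
  const input =
    parts.eventType + parts.user + parts.repo + parts.timestamp + parts.identifier;
  return createHash("sha256").update(input).digest("hex").slice(0, 32);
}
```

Because the hash is deterministic, re-ingesting the same event yields the same content_hash, which is what lets the unique index reject the duplicate row.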