Bot Detection Customization Guide¶
Overview¶
OSS Sustain Guard detects automated accounts (CI/CD systems, Dependabot, and similar) and excludes them from sustainability metric calculations, so that the metrics reflect human contributor activity.
How Bot Detection Works¶
Bot detection runs in four stages:
Stage 1: Exact Pattern Matching (Most Reliable)¶
Matches against known bot account patterns from GitHub, GitLab, and other services:
- dependabot[bot]
- github-actions[bot]
- renovate[bot]
- dependabot-preview[bot]
- And many others...
Stage 2: Email Domain Detection¶
Checks if the commit author's email belongs to a bot service:
- *@noreply.github.com
- *@users.noreply.github.com
- *@gitlab.com
Stage 3: Keyword-based Matching (Fallback)¶
Checks if the login/name contains common bot keywords:
- Contains "bot"
- Contains "action"
- Contains "ci-"
- And others...
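A minimal sketch of a substring-based keyword check, assuming a case-insensitive comparison. The function name and the shortened keyword tuple are hypothetical, and the tool's real list contains more entries. The sketch also shows why this stage is only a fallback: any login that happens to contain a keyword is flagged.

```python
# Keywords named in the list above; the built-in list contains others.
BOT_KEYWORDS = ("bot", "action", "ci-")


def matches_bot_keyword(login: str) -> bool:
    """Return True when the lowercased login contains any bot keyword."""
    lowered = login.lower()
    return any(keyword in lowered for keyword in BOT_KEYWORDS)


print(matches_bot_keyword("acme-release-bot"))  # True
print(matches_bot_keyword("robotics-expert"))   # True: "robotics" contains "bot"
print(matches_bot_keyword("alice"))             # False
```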
Stage 4: Custom Exclusion List¶
Allows you to explicitly mark specific users as bots through configuration.
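The four stages can be pictured as one function that reports which check matched. This is an illustrative sketch, not the tool's implementation: the name classify, the shortened pattern lists, the stage labels, and the assumption that the first matching stage wins are all hypothetical.

```python
from typing import Optional

# Shortened, illustrative lists; the built-in ones are longer.
KNOWN_BOT_LOGINS = {"dependabot[bot]", "github-actions[bot]", "renovate[bot]"}
BOT_EMAIL_SUFFIXES = ("@noreply.github.com", "@users.noreply.github.com", "@gitlab.com")
BOT_KEYWORDS = ("bot", "action", "ci-")


def classify(login: str, email: str = "",
             exclude_users: frozenset = frozenset()) -> Optional[str]:
    """Return the label of the first matching stage, or None for a presumed human."""
    name = login.lower()
    if name in KNOWN_BOT_LOGINS:                          # Stage 1: exact match
        return "exact"
    if email.lower().endswith(BOT_EMAIL_SUFFIXES):        # Stage 2: email domain
        return "email"
    if any(keyword in name for keyword in BOT_KEYWORDS):  # Stage 3: keyword fallback
        return "keyword"
    if login in exclude_users:                            # Stage 4: custom list
        return "custom"
    return None


custom = frozenset({"jenkins-automation"})
print(classify("dependabot[bot]"))                           # exact
print(classify("jenkins-automation", exclude_users=custom))  # custom
print(classify("alice", "alice@example.com"))                # None
```

Returning a label rather than a plain boolean makes it easier to see which stage caused an account to be excluded.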
Configuration¶
Adding Custom Excluded Users¶
You can configure OSS Sustain Guard to treat specific accounts as bots using either .oss-sustain-guard.toml or pyproject.toml.
Using .oss-sustain-guard.toml (Recommended)¶
Create or edit .oss-sustain-guard.toml in your project root:
[tool.oss-sustain-guard]
# Exclude specific users from contributor metrics
exclude-users = [
"my-internal-ci-user",
"release-automation",
"internal-bot-account",
]
Using pyproject.toml¶
Add to your pyproject.toml:
[tool.oss-sustain-guard]
exclude-users = [
"my-internal-ci-user",
"release-automation",
]
Common Use Cases¶
1. Internal CI/CD Accounts¶
If your organization uses internal CI/CD systems that commit under a specific account:
[tool.oss-sustain-guard]
exclude-users = ["jenkins-bot", "gitlab-runner", "company-ci"]
2. Release Automation¶
If you have a dedicated bot account for automatic releases:
[tool.oss-sustain-guard]
exclude-users = ["autorelease-bot", "version-bumper"]
3. Documentation Generators¶
If you use automated tools that commit generated files:
[tool.oss-sustain-guard]
exclude-users = ["docgen-bot", "changelog-generator"]
Examples¶
Example: Python Project with Internal Bot¶
.oss-sustain-guard.toml:
[tool.oss-sustain-guard]
# Built-in bot patterns are excluded automatically; this adds our internal CI account
exclude-users = ["internal-ci-system"]
exclude = ["test-fixtures", "example-packages"]
Now when analyzing your project:
- dependabot[bot] will be automatically excluded (built-in pattern)
- github-actions[bot] will be automatically excluded (built-in pattern)
- internal-ci-system will be excluded (custom configuration)
- Only genuine human contributors will be counted
Example: Monorepo with Multiple Bot Systems¶
.oss-sustain-guard.toml:
[tool.oss-sustain-guard]
exclude-users = [
"jenkins-automation",
"gha-deployer",
"changelog-bot",
"security-scanner-bot",
]
Troubleshooting¶
Issue: A real user is being excluded as a bot¶
If a legitimate contributor's name contains a bot keyword (e.g., "robotics-expert"), the keyword-based detection might incorrectly classify them as a bot.
Solution: Confirm that the account matches neither an exact bot pattern nor a bot email domain. If only the keyword fallback is matching, ask your VCS administrator whether the account name can be changed.
Alternatively, verify manually whether the account is a bot by querying the VCS API directly.
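For GitHub-hosted accounts, one way to check manually is the public REST endpoint GET /users/{username}, whose response includes a type field ("User", "Bot", or "Organization"). The sketch below covers GitHub only; the helper names are hypothetical, and the request is unauthenticated, so it is subject to rate limits.

```python
import json
import urllib.parse
import urllib.request


def github_user_url(login: str) -> str:
    """Build the REST URL for a login, percent-encoding characters such as '[' and ']'."""
    return "https://api.github.com/users/" + urllib.parse.quote(login, safe="")


def is_bot_account(user_payload: dict) -> bool:
    """Interpret the 'type' field of a GitHub user payload."""
    return user_payload.get("type") == "Bot"


def fetch_github_user(login: str) -> dict:
    """Fetch the public profile for a login (performs a network request)."""
    request = urllib.request.Request(
        github_user_url(login),
        headers={"Accept": "application/vnd.github+json"},
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        return json.load(response)


print(github_user_url("dependabot[bot]"))
# https://api.github.com/users/dependabot%5Bbot%5D
```

Calling is_bot_account(fetch_github_user("robotics-expert")) would then show whether GitHub itself regards the account as a bot, independently of any keyword match.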
Issue: A known bot is not being excluded¶
If a bot account is not in the default list and doesn't match keyword patterns:
- Check if it should be added to the default patterns (report an issue)
- Add it to your exclude-users configuration
Example: If your organization uses a custom bot acme-corp-bot:
[tool.oss-sustain-guard]
exclude-users = ["acme-corp-bot"]
Built-in Bot Patterns¶
The following bots are automatically recognized without additional configuration:
GitHub Bots¶
- dependabot[bot]
- github-actions[bot]
- renovate[bot]
- snyk-bot[bot]
- codecov[bot]
- coveralls[bot]
- And others...
GitLab Bots¶
- dependabot
- renovate-bot
- gitlab-runner
Email Domain Patterns¶
- Any email ending in @noreply.github.com
- Any email ending in @users.noreply.github.com
- Any email ending in @gitlab.com
Keyword Patterns (Fallback)¶
- Logins containing "bot"
- Logins containing "action"
- Logins containing "ci-"
- Logins containing "copilot"
- And others...
Impact on Metrics¶
Bot exclusion affects the following metrics:
- Contributor Redundancy (Bus Factor): Excluded from contributor count
- Maintainer Retention: Excluded from maintainer analysis
- Contributor Retention: Excluded from retention calculations
- Contributor Attraction: Excluded from new contributor count
- Organizational Diversity: Excluded from diversity analysis
- Contributor Count Signal: Excluded from contributor count
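To see why exclusion matters for contributor-based metrics, consider a toy commit history. The data, the function name, and the "share of commits" measure are hypothetical simplifications; they are not the formulas OSS Sustain Guard uses for any of the metrics listed.

```python
from collections import Counter

# Hypothetical commit authors for one repository.
COMMIT_AUTHORS = [
    "alice", "dependabot[bot]", "bob", "dependabot[bot]",
    "alice", "internal-ci-system", "dependabot[bot]",
]
EXCLUDED = frozenset({"dependabot[bot]", "internal-ci-system"})


def top_contributor(authors, excluded=frozenset()):
    """Return (login, share of commits) for the busiest non-excluded author."""
    counts = Counter(author for author in authors if author not in excluded)
    if not counts:
        return None
    login, commits = counts.most_common(1)[0]
    return login, commits / sum(counts.values())


print(top_contributor(COMMIT_AUTHORS))            # bot dominates: 3 of 7 commits
print(top_contributor(COMMIT_AUTHORS, EXCLUDED))  # alice: 2 of 3 human commits
```

Without exclusion, the busiest "contributor" is a bot and the repository appears to have four contributors; with exclusion, it has two, and one of them accounts for two thirds of the human commits.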
Best Practices¶
- Start with defaults: The built-in patterns cover most common bots. Only add custom exclusions when necessary.
- Document your choices: Comment your configuration to explain why specific accounts are excluded.
- Review periodically: As your automation tools change, update your configuration accordingly.
- Be conservative: Only exclude accounts you're certain are bots. False negatives (missing a bot) are better than false positives (incorrectly excluding humans).
- Test your configuration: Run os4g check --demo to see how bot detection affects your metrics.