Scan Depth and Time Window Configuration¶
OSS Sustain Guard provides flexible data sampling controls to balance analysis depth, speed, and API usage.
Scan Depth Options¶
Control how much data is collected from GitHub/GitLab APIs with the --scan-depth option:
Shallow (--scan-depth shallow)¶
Quick scan with minimal data collection (roughly half of the default sample sizes).
Use when:
- Quick health checks
- CI/CD pipelines with strict time limits
- Rate limit concerns
- Large number of packages to analyze
Sample sizes:
- Commits: 50
- Merged PRs: 20
- Closed PRs: 20
- Open Issues: 10
- Closed Issues: 20
- Releases: 5
- Reviews: 3
Example:
os4g check --scan-depth shallow
os4g check django flask --scan-depth shallow
Default (--scan-depth default)¶
Balanced sampling for typical analysis (default behavior).
Use when:
- Standard dependency audits
- General sustainability evaluation
- Most use cases
Sample sizes:
- Commits: 100
- Merged PRs: 50
- Closed PRs: 50
- Open Issues: 20
- Closed Issues: 50
- Releases: 10
- Reviews: 10
Example:
os4g check # default depth is used
os4g check --scan-depth default # explicit
Deep (--scan-depth deep)¶
Comprehensive analysis with roughly twice the default sample sizes, subject to the 100-record API cap.
Use when:
- Detailed investigation of specific projects
- Critical dependency evaluation
- Research and in-depth analysis
- You have generous API rate limits
Sample sizes (API limits: 100 per query):
- Commits: 100
- Merged PRs/MRs: 100
- Closed PRs/MRs: 100
- Open Issues: 50
- Closed Issues: 100
- Releases: 20
- Reviews: 20
Example:
os4g check requests --scan-depth deep
os4g check --recursive --scan-depth deep
Very Deep (--scan-depth very_deep)¶
Maximum detail analysis with highest sample counts for all data types.
Use when:
- Critical security audits
- Extensive research projects
- Maximum data collection needed
- Comprehensive historical analysis
- API rate limits are not a concern
Sample sizes (API limits: 100 per query):
- Commits: 100
- Merged PRs/MRs: 100
- Closed PRs/MRs: 100
- Open Issues: 100
- Closed Issues: 100
- Releases: 50
- Vulnerability Alerts: 50 (GitHub only)
- Forks: 100
- Reviews: 50
Differences from Deep:
- 2x the open issues (50 → 100)
- 2.5x the releases (20 → 50)
- 2.5x the vulnerability alerts (20 → 50, GitHub only)
- 2x the forks (50 → 100)
- 2.5x the reviews (20 → 50)
Example:
os4g check critical-dependency --scan-depth very_deep
os4g check --scan-depth very_deep --days-lookback 180
Note: Both the GitHub GraphQL API and the GitLab REST API return at most 100 records per query for most data types. Very deep mode maximizes data collection for all fields while respecting these constraints.
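To make the four presets easy to compare at a glance, the sample sizes above can be restated as a single lookup table. The sketch below is illustrative only: SCAN_DEPTH_PRESETS, API_PAGE_LIMIT, and sample_size are made-up names for this page, not OSS Sustain Guard internals; the numbers simply repeat the lists documented above.
# Illustrative restatement of the documented sample sizes per scan depth.
# The tool's real internals may represent these presets differently.
SCAN_DEPTH_PRESETS = {
    "shallow": {"commits": 50, "merged_prs": 20, "closed_prs": 20,
                "open_issues": 10, "closed_issues": 20, "releases": 5, "reviews": 3},
    "default": {"commits": 100, "merged_prs": 50, "closed_prs": 50,
                "open_issues": 20, "closed_issues": 50, "releases": 10, "reviews": 10},
    "deep": {"commits": 100, "merged_prs": 100, "closed_prs": 100,
             "open_issues": 50, "closed_issues": 100, "releases": 20, "reviews": 20},
    "very_deep": {"commits": 100, "merged_prs": 100, "closed_prs": 100,
                  "open_issues": 100, "closed_issues": 100, "releases": 50, "reviews": 50},
}

API_PAGE_LIMIT = 100  # GitHub GraphQL and GitLab REST both cap results at 100 per query

def sample_size(depth: str, field: str) -> int:
    """Return the documented sample size for a depth/field pair, capped at the API limit."""
    return min(SCAN_DEPTH_PRESETS[depth][field], API_PAGE_LIMIT)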
Time Window Filtering¶
Limit analysis to recent activity with the --days-lookback option.
Usage¶
# Analyze only the last 30 days
os4g check --days-lookback 30
# Last 3 months (90 days)
os4g check --days-lookback 90
# Last 6 months (180 days)
os4g check --days-lookback 180
# Last year (365 days)
os4g check --days-lookback 365
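Conceptually, a lookback window is just a cutoff timestamp N days in the past. The sketch below uses a made-up lookback_cutoff helper (not part of os4g) to show how such a cutoff can be computed:
# Illustrative sketch: mapping a --days-lookback value to a cutoff timestamp.
from datetime import datetime, timedelta, timezone

def lookback_cutoff(days: int) -> str:
    """Return an ISO-8601 UTC timestamp `days` in the past (the 'since' boundary)."""
    cutoff = datetime.now(timezone.utc) - timedelta(days=days)
    return cutoff.isoformat()

print(lookback_cutoff(90))  # ISO timestamp 90 days before "now"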
When to Use Time Filtering¶
Short windows (30-90 days):
- Focus on recent project activity
- Evaluate current maintainer responsiveness
- Check if a project is actively maintained
- Fast-moving projects where older data is less relevant
Medium windows (90-180 days):
- Seasonal projects
- Balanced view of recent trends
- Most general-purpose analyses
Long windows (180-365 days):
- Projects with slower release cycles
- Academic or research projects
- Comprehensive historical analysis
No time limit (default):
- Full project history within sample limits
- Best for comprehensive evaluation
- Recommended for most analyses
How Time Filtering Works¶
When you specify --days-lookback N:
- API-level filtering: the GitHub/GitLab APIs receive a 'since' parameter so that only recent data is fetched
- Data is collected according to the scan depth (shallow/default/deep/very_deep) within that time window
- Metrics are calculated from the time-filtered data
Example: With --scan-depth deep --days-lookback 90:
- The API fetches only commits from the last 90 days (API-level filtering)
- Up to 100 commits are collected within that period (deep mode limit)
- Metrics are calculated from the 90-day window
Benefits of API-level filtering:
- More efficient: only data inside the window is fetched
- Better accuracy: a true representation of activity in the time window
- No missed data: recent activity is not pushed out of the sample by older records
Note: For some data types (issues, PRs), client-side filtering supplements API-level filtering to ensure accuracy.
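As a purely illustrative sketch of the two layers described above, the snippet below passes a since timestamp to GitHub's REST commits endpoint and applies a client-side date check to issues. OSS Sustain Guard itself uses the GitHub GraphQL API, so its actual queries, pagination, and field choices differ; psf/requests is used here only as a familiar example repository, and error handling is omitted.
# Illustrative sketch of API-level plus client-side time filtering (not os4g's real code).
from datetime import datetime, timedelta, timezone
import requests

CUTOFF = datetime.now(timezone.utc) - timedelta(days=90)

# API-level filtering: the REST commits endpoint accepts a `since` timestamp,
# so only commits inside the window are returned (up to the per-page cap).
commits = requests.get(
    "https://api.github.com/repos/psf/requests/commits",
    params={"since": CUTOFF.isoformat(), "per_page": 100},
    timeout=30,
).json()

# Client-side filtering: some record types are fetched without a `since` parameter,
# so a date check is applied afterwards to keep the window accurate.
issues = requests.get(
    "https://api.github.com/repos/psf/requests/issues",
    params={"state": "closed", "per_page": 100},
    timeout=30,
).json()
recent_issues = [
    i for i in issues
    if datetime.fromisoformat(i["updated_at"].replace("Z", "+00:00")) >= CUTOFF
]
print(len(commits), "commits and", len(recent_issues), "issues in the last 90 days")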
Combining Options¶
You can combine scan depth and time windows for targeted analysis:
# Quick check of recent activity
os4g check --scan-depth shallow --days-lookback 30
# Deep dive into last quarter
os4g check important-package --scan-depth deep --days-lookback 90
# Comprehensive recent analysis across all deps
os4g check --recursive --scan-depth deep --days-lookback 180
# Fast CI check of last month
os4g check --scan-depth shallow --days-lookback 30 --output-style compact
Performance Considerations¶
API Rate Limits¶
- Shallow: roughly 40% lower API usage than default
- Default: balanced API usage
- Deep: a similar number of API calls to default (more data per call)
- Very Deep: a similar number of API calls, but larger responses
GitHub API rate limits (with token):
- 5,000 requests per hour
- Each package analysis uses 1-2 requests
- Scan depth affects response size, not request count
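To see how much of your hourly budget remains before a large run, you can query GitHub's rate_limit endpoint directly. The check below is a generic snippet, not an os4g feature, and it assumes your token is available in the GITHUB_TOKEN environment variable:
# Generic check of the remaining GitHub API budget (not an os4g command).
import os
import requests

headers = {}
token = os.environ.get("GITHUB_TOKEN")  # assumption: the token lives in GITHUB_TOKEN
if token:
    headers["Authorization"] = f"Bearer {token}"

rate = requests.get("https://api.github.com/rate_limit", headers=headers, timeout=10).json()
for name in ("core", "graphql"):  # GraphQL has its own budget, separate from REST
    bucket = rate["resources"][name]
    print(f"{name}: {bucket['remaining']} of {bucket['limit']} requests remaining this hour")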
Verbose Output¶
Use --verbose to see scan configuration:
os4g check --scan-depth deep --days-lookback 90 --verbose
Output shows:
📊 Scan depth: deep
📅 Time window: last 90 days
🔍 Analyzing 10 package(s)...
Best Practices¶
For CI/CD Pipelines¶
# Fast, focused check
os4g check --scan-depth shallow --days-lookback 30 --output-style compact
For Regular Audits¶
# Balanced, comprehensive
os4g check --scan-depth default --days-lookback 90
For Deep Investigation¶
# Thorough analysis
os4g check specific-package --scan-depth deep --output-style detail
For Large Projects¶
# Efficient recursive scanning
os4g check --recursive --scan-depth shallow --days-lookback 60
Cache Behavior¶
Scan depth and time window settings do not invalidate the cache:
- The cache stores raw data samples
- Settings control what data is fetched and filtered
- Different settings may reuse cached data
- Use --no-cache to force fresh analysis
Example:
# First run: fetches and caches data (default depth)
os4g check requests
# Second run: uses cache, same data
os4g check requests --scan-depth shallow
# Third run: forces fresh fetch with deep sampling
os4g check requests --scan-depth deep --no-cache
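The reuse behaviour above can be pictured as a cache keyed by repository rather than by scan settings. The sketch below is a conceptual illustration only (fetch_raw_samples and analyze are invented stand-ins) and says nothing about OSS Sustain Guard's real cache format or storage location.
# Conceptual sketch: the cache is keyed by repository, so changing scan settings
# alone does not invalidate it; --no-cache is what forces a fresh fetch.
cache: dict[str, dict] = {}

def fetch_raw_samples(repo: str, scan_depth: str) -> dict:
    """Stand-in for the real data collection step."""
    print(f"fetching {repo} at depth {scan_depth}")
    return {"repo": repo, "scan_depth": scan_depth, "commits": []}

def analyze(repo: str, scan_depth: str = "default", no_cache: bool = False) -> dict:
    if no_cache or repo not in cache:
        cache[repo] = fetch_raw_samples(repo, scan_depth)
    return cache[repo]

analyze("requests")                                     # fetches and caches
analyze("requests", scan_depth="shallow")               # reuses the cached data
analyze("requests", scan_depth="deep", no_cache=True)   # forces a fresh fetch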
Configuration File Support¶
Currently, scan depth and time window are CLI-only options. Future versions may support configuration file settings:
# Future: .oss-sustain-guard.toml
[tool.oss-sustain-guard]
scan_depth = "deep"
days_lookback = 90
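If such a file is added, reading it from Python 3.11+ needs only the standard library. The snippet below is speculative: it assumes the file name and keys shown above, which are not yet supported by the tool.
# Speculative sketch: reading the hypothetical future config file shown above.
import tomllib  # Python 3.11+

with open(".oss-sustain-guard.toml", "rb") as f:
    config = tomllib.load(f)

settings = config.get("tool", {}).get("oss-sustain-guard", {})
scan_depth = settings.get("scan_depth", "default")
days_lookback = settings.get("days_lookback")  # None means no time limit
print(scan_depth, days_lookback)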
Troubleshooting¶
"Rate limit exceeded" errors¶
Solution: Use --scan-depth shallow or analyze fewer packages:
os4g check --scan-depth shallow
Analysis too slow¶
Solutions:
- Use --scan-depth shallow
- Add --days-lookback to focus on recent data
- Reduce the number of packages
- Use --no-cache less frequently
Insufficient data warnings¶
If you see "Not enough data for metric X":
Solutions:
- Use --scan-depth deep to collect more samples
- Remove or increase --days-lookback
- Check whether the project is actually active
Examples¶
Quick Health Check¶
os4g check --scan-depth shallow --output-style compact
Monthly Review¶
os4g check --days-lookback 30 --output-format html --output-file monthly-report.html
Comprehensive Audit¶
os4g check --scan-depth deep --output-style detail --verbose
Maximum Detail Analysis¶
os4g check critical-package --scan-depth very_deep --days-lookback 180 --output-style detail
CI/CD Integration¶
os4g check --scan-depth shallow --days-lookback 30 --no-cache --output-style compact
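For pipelines that script around the CLI, a thin wrapper like the sketch below can run the check and propagate its exit status. It assumes os4g exits non-zero when a check fails; verify that against your installed version before using it as a gate.
# Minimal CI wrapper sketch. Assumption: os4g signals failures via a non-zero exit code.
import subprocess
import sys

result = subprocess.run(
    ["os4g", "check", "--scan-depth", "shallow", "--days-lookback", "30",
     "--no-cache", "--output-style", "compact"],
    capture_output=True,
    text=True,
)
print(result.stdout)
if result.returncode != 0:
    print(result.stderr, file=sys.stderr)
sys.exit(result.returncode)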