Developer Archetypes

Unsupervised clustering of 9,896 top GitHub developers by 22 behavioral features — SkillBench Research Lab, UCSB

9,896 developers analyzed

22 behavioral features

4 primary clusters

7 total archetypes

Finding the Natural Groupings

We extracted 22 features from each developer's public GitHub profile (value dimensions, activity patterns, skill breadth, AI-era signals, and identity metadata), then ran k-means clustering to find natural groupings. The silhouette score peaks at k=4, which gives the four primary clusters described below.
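As a rough illustration of how a silhouette sweep selects k, the sketch below runs a small pure-Python k-means on toy 2D points and scores k = 2 through 6. The toy data, the farthest-point initialization, and the helper names `kmeans` and `silhouette` are assumptions made for illustration. The source does not say which library, initialization, or feature scaling was used on the 22-feature data.

```python
from math import dist

def kmeans(points, k, iters=50):
    """Lloyd's algorithm with deterministic farthest-point seeding (an assumption)."""
    centroids = [points[0]]
    while len(centroids) < k:
        # Next seed: the point farthest from its nearest existing centroid.
        centroids.append(max(points, key=lambda p: min(dist(p, c) for c in centroids)))
    labels = None
    for _ in range(iters):
        new_labels = [min(range(k), key=lambda j: dist(p, centroids[j])) for p in points]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # keep the old centroid if a cluster empties
                centroids[j] = tuple(sum(xs) / len(members) for xs in zip(*members))
    return labels

def silhouette(points, labels):
    """Mean silhouette coefficient; singleton clusters contribute 0."""
    clusters = {}
    for p, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(p)
    total = 0.0
    for p, lab in zip(points, labels):
        own = clusters[lab]
        if len(own) == 1:
            continue
        a = sum(dist(p, q) for q in own) / (len(own) - 1)   # cohesion
        b = min(sum(dist(p, q) for q in other) / len(other)  # nearest other cluster
                for m, other in clusters.items() if m != lab)
        total += (b - a) / max(a, b)
    return total / len(points)

# Toy data: four tight, well-separated groups of five points each.
offsets = [(-0.5, -0.5), (0.5, -0.5), (-0.5, 0.5), (0.5, 0.5), (0.0, 0.0)]
centers = [(0, 0), (10, 0), (0, 10), (10, 10)]
points = [(cx + dx, cy + dy) for cx, cy in centers for dx, dy in offsets]

scores = {k: silhouette(points, kmeans(points, k)) for k in range(2, 7)}
best_k = max(scores, key=scores.get)
```

On this toy data the sweep selects four clusters, because merging two groups or splitting one both lower the mean silhouette. For real data, scikit-learn's `KMeans` and `silhouette_score` would be the more usual choice than a hand-rolled loop.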

9,896 Developers in 2D

PCA projection of the 22-dimensional feature space. Clusters overlap in 2D because the real separation lives in higher dimensions — but the structure is visible.
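To make the projection step concrete, here is a minimal sketch of PCA down to two dimensions using power iteration with deflation. It also returns the fraction of variance the two components retain, which is one way to quantify how much separation a 2D view loses. The function name `pca_2d`, the toy rows, and the choice of power iteration are assumptions for illustration, and the sketch assumes the data varies along at least two directions.

```python
from math import sqrt

def pca_2d(rows, iters=200):
    """Project rows onto their top two principal components."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]
    # Sample covariance matrix of the centered data.
    cov = [[sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(d)]
           for i in range(d)]
    total_var = sum(cov[i][i] for i in range(d))
    components, eigvals = [], []
    for _ in range(2):
        v = [1.0] * d
        for _step in range(iters):
            # Power iteration: repeatedly apply cov and renormalize.
            w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
            norm = sqrt(sum(x * x for x in w))
            v = [x / norm for x in w]
        lam = sum(v[i] * cov[i][j] * v[j] for i in range(d) for j in range(d))
        components.append(v)
        eigvals.append(lam)
        # Deflate so the next pass converges to the runner-up direction.
        cov = [[cov[i][j] - lam * v[i] * v[j] for j in range(d)] for i in range(d)]
    coords = [tuple(sum(c[j] * r[j] for j in range(d)) for c in components)
              for r in centered]
    return coords, sum(eigvals) / total_var

# Toy 3D rows with most variance on the first axis, least on the third.
rows = [(3, 0, 0), (-3, 0, 0), (0, 2, 0), (0, -2, 0), (0, 0, 1), (0, 0, -1)]
coords, explained = pca_2d(rows)
```

An explained-variance fraction well below 1 on the real 22-feature data would be consistent with clusters that look overlapped in the 2D plot yet separate cleanly in the full space.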

Cluster Value Profiles

Radar charts showing each cluster's average score across six value dimensions, compared to the global mean.
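The numbers behind such a chart are per-cluster means set against the global mean. The sketch below computes both for a toy table. The dimension names `quality` and `cadence` are borrowed from the taxonomy columns purely as placeholders, since the source does not list the six value dimensions, and `cluster_profiles` is a hypothetical helper.

```python
from collections import defaultdict

def cluster_profiles(rows, labels):
    """Return (global_mean, {label: per-dimension mean}) for rows of dicts."""
    dims = list(rows[0])
    global_mean = {d: sum(r[d] for r in rows) / len(rows) for d in dims}
    grouped = defaultdict(list)
    for row, label in zip(rows, labels):
        grouped[label].append(row)
    profiles = {
        label: {d: sum(r[d] for r in members) / len(members) for d in dims}
        for label, members in grouped.items()
    }
    return global_mean, profiles

# Toy rows: two placeholder dimensions, two clusters of two developers each.
rows = [
    {"quality": 0.8, "cadence": 3.0},
    {"quality": 0.6, "cadence": 3.0},
    {"quality": 0.2, "cadence": 1.0},
    {"quality": 0.4, "cadence": 1.0},
]
labels = ["A", "A", "B", "B"]

global_mean, profiles = cluster_profiles(rows, labels)
# Positive values mean the cluster sits above the global mean on that dimension.
deltas = {label: {d: prof[d] - global_mean[d] for d in prof}
          for label, prof in profiles.items()}
```

Each radar polygon would then plot one entry of `profiles`, with `global_mean` drawn as the reference outline.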

Seasoned Builders

4,118 (41.6%)

High pre-AI history (78%), active cadence, broad collaboration. The backbone of open source.

Login	Name	Followers	Annual Contribs	Pre-AI	Reviews
torvalds	Linus Torvalds	287,983	3,264	87%	2
karpathy	Andrej	135,647	859	69%	6
yyx990803	Evan You	107,039	1,902	94%	17
gaearon	dan	90,422	836	78%	42
ruanyf	Ruan YiFeng	85,507	685	97%	0

Pre-AI Veterans (Quiet)

4,037 (40.8%)

Deep pre-AI foundations (91%) but sporadic current activity. Built things, then stepped back.

Login	Name	Followers	Annual Contribs	Pre-AI
gustavoguanabara	Gustavo Guanabara	109,928	8	57%
peng-zhihui	空心	86,188	0	97%
rafaballerini	Rafaella Ballerini	59,052	0	94%
tj	TJ	51,571	2	100%
IDouble	Alp ₿📈🚀🌕	50,276	24	95%

AI-Era Arrivals

1,268 (12.8%)

Low pre-AI ratio (23%), newer accounts (about 5 years old), moderate activity. Became visible after AI coding tools emerged.

Login	Name	Followers	Annual Contribs	Pre-AI	Reviews
dmalan	David J. Malan	35,793	71	0%	16
elyxdev	Elyx	26,763	63	0%	0
lucast1574	Lucas Santillan	21,120	377	0%	0
OracleBrain	Aashis Jha	15,368	32	7%	0
george0st	jist	13,508	1,109	0%	0

Minimal Evidence Profiles

473 (4.8%)

High follower counts but near-zero activity, languages, and domains. Famous but not building publicly.

Login	Name	Followers	Annual Contribs	Pre-AI
claude	Claude	46,597	0	100%
cumsoft	cumsoft	18,410	0	100%
metatimeofficial	Metatime	18,023	0	0%
PremChapagain	Prem Chapagain	14,157	4	33%
ghost	Deleted user	12,145	0	0%

Sub-Archetypes: AI-Era Arrivals

The AI-Era Arrivals cluster contains developers who joined or became active primarily after AI coding tools. Sub-clustering reveals three distinct sub-types.

Active AI-Native Builders

514 (41%)

Weekly cadence, highest quality in AI-Era group, some collaboration. Genuinely productive — but started in the AI era.

Login	Name	Followers	Annual Contribs	Pre-AI
george0st	jist	13,508	1,109	0%
jrohitofficial	Rohit Jha	12,721	648	35%
XiaomingX	Y11	11,465	3,702	0%
LaurieWired	LaurieWired	11,006	208	0%
elder-plinius	pliny	10,922	477	0%

Minimal Evidence Newcomers

415 (33%)

Accounts under 3 years old, near-zero pre-AI commits, no code reviews. High social signal, low work-product signal.

Login	Name	Followers	Annual Contribs	Pre-AI
elyxdev	Elyx	26,763	63	0%
lucast1574	Lucas Santillan	21,120	377	0%
pewdiepie-archdaemon	PewDiePie	13,278	130	0%
elidianaandrade	Eli	12,423	253	5%
meliksahyorulmazlar	meliksahyorulmazlar	10,711	69	0%

Faded Practitioners

339 (27%)

Some pre-AI history (62%) but activity declined. Had foundations, went quiet.

Login	Name	Followers	Annual Contribs	Pre-AI	Reviews
dmalan	David J. Malan	35,793	71	0%	16
OracleBrain	Aashis Jha	15,368	32	7%	0
996icu	996icu	8,566	4	100%	0
JCSIVO	JCSIVO	7,512	26	73%	0
U7P4L-IN	U7P4L x C0D3R	7,455	2	100%	0
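The percentages quoted for the three sub-types are shares of the AI-Era Arrivals cluster, not of all analyzed developers. A short worked calculation, using only the counts given in this report, shows both views side by side (the variable names are illustrative):

```python
ANALYZED = 9896  # developers analyzed overall

sub_counts = {
    "Active AI-Native Builders": 514,
    "Minimal Evidence Newcomers": 415,
    "Faded Practitioners": 339,
}
parent_n = sum(sub_counts.values())  # size of the AI-Era Arrivals cluster

# For each sub-type: (share of parent cluster in whole %, share of all developers in %).
shares = {
    name: (round(100 * n / parent_n), round(100 * n / ANALYZED, 1))
    for name, n in sub_counts.items()
}

for name, (of_parent, of_all) in shares.items():
    print(f"{name}: {of_parent}% of cluster, {of_all}% of all developers")
```

The three counts add up to the 1,268 developers in the parent cluster, and the within-cluster shares round to 41%, 33%, and 27%.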

Sub-Archetypes: Seasoned Builders

The largest cluster splits into two recognizable types: individual contributors vs. community hubs.

Solo Veterans

4,116 (>99%)

Deep individual contributors, strong foundations, low review/collaboration activity. Domain experts in their lane.

Login	Name	Followers	Annual Contribs	Pre-AI	Reviews
torvalds	Linus Torvalds	287,983	3,264	87%	2
karpathy	Andrej	135,647	859	69%	6
yyx990803	Evan You	107,039	1,902	94%	17
gaearon	dan	90,422	836	78%	42
ruanyf	Ruan YiFeng	85,507	685	97%	0

Ecosystem Leaders

2 (<1%)

Highest collaboration breadth (30+), active code reviewers, daily cadence. Framework authors and open source maintainers.

Login	Name	Followers	Annual Contribs	Pre-AI	Reviews
alesanchezr	Alejandro Sanchez	2,068	12,605	86%	122
punkpeye	Frank Fiegel	1,709	7,995	2%	11

Complete Taxonomy

Archetype	N	%	Quality	Cadence	Pre-AI	Age	Top Exemplar
Solo Veterans	4,116	41.6%	0.76	3.1	78%	13y	torvalds
Ecosystem Leaders	2	0.0%	0.61	3.5	44%	10y	alesanchezr
Pre-AI Veterans (Quiet)	4,037	40.8%	0.68	1.1	91%	12y	gustavoguanabara
Active AI-Native Builders	514	5.2%	0.51	2.8	14%	5y	george0st
Minimal Evidence Newcomers	415	4.2%	0.36	1.7	2%	3y	elyxdev
Faded Practitioners	339	3.4%	0.40	1.1	62%	6y	dmalan
Minimal Evidence Profiles	473	4.8%	0.17	0.5	31%	5y	claude
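As a consistency check on the table above, the archetype counts can be rolled up into the four primary clusters and compared against the totals reported earlier. This is a small worked calculation over the published counts only. The grouping of archetypes under each cluster follows the section headings, and the variable names are illustrative.

```python
taxonomy = {
    "Seasoned Builders": {"Solo Veterans": 4116, "Ecosystem Leaders": 2},
    "Pre-AI Veterans (Quiet)": {"Pre-AI Veterans (Quiet)": 4037},
    "AI-Era Arrivals": {
        "Active AI-Native Builders": 514,
        "Minimal Evidence Newcomers": 415,
        "Faded Practitioners": 339,
    },
    "Minimal Evidence Profiles": {"Minimal Evidence Profiles": 473},
}

# Roll archetype counts up to their primary cluster.
cluster_totals = {name: sum(subs.values()) for name, subs in taxonomy.items()}
total = sum(cluster_totals.values())
archetype_count = sum(len(subs) for subs in taxonomy.values())

# Share of all analyzed developers, to one decimal place.
cluster_pct = {name: round(100 * n / total, 1) for name, n in cluster_totals.items()}

for name, n in cluster_totals.items():
    print(f"{name}: {n} ({cluster_pct[name]}%)")
```

The seven archetype counts sum to 9,896, and the rolled-up clusters reproduce the 41.6%, 40.8%, 12.8%, and 4.8% shares shown in the cluster sections.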

Key finding: Among the top 10,000 GitHub users by follower count, 12.8% (1,268 of the 9,896 analyzed) became prominent primarily in the AI era. Within that group, about a third (415) show minimal evidence of engineering depth: high social signal with low work-product signal. From public artifacts alone, these profiles are indistinguishable from legitimate developers who simply work in private repos.

The Telemetry Gap

This analysis shows what work products can tell you — and where they hit a wall.

Behavioral archetypes from public signal

Distinct patterns of quality, consistency, collaboration, and AI-era adaptation cluster reliably across ~10,000 profiles.

Decoupled social proof from evidence

High follower counts don't guarantee engineering depth. The data shows where these signals diverge.

Who actually wrote the code

Commit metadata names an author and a committer, but not whether a person or an AI tool produced the code. A single commit may be 90% AI-generated or 100% handwritten, and the git log looks identical.

Learning velocity

A developer who went from zero to competent in 6 months looks the same as one who plateaued 5 years ago. Static snapshots miss trajectories.

AI delegation patterns

Did they use AI for boilerplate and write the hard parts themselves? Or prompt-engineer the entire feature? GitHub doesn't know.

Skill in context

Knowing Python as a software engineer is different from knowing Python as a business analyst. Repos show the what, not the how or why.

SkillBench closes this gap by instrumenting the development environment itself — turning the process of coding into a legible signal. These archetypes are the boot block. Telemetry turns them into living profiles.