Chem-Bench extends our bottom-up knowledge-graph reasoning evaluation to chemistry. The suite is built from a curated chemistry knowledge graph and exposes a balanced 30,000-question pool — 10,000 questions per hop across 1-, 2-, and 3-hop reasoning chains.
To keep the experience fresh, the suite serves one 50-question session per hop each week over a 40-week cycle. Subsequent cycles deterministically reshuffle the per-hop pools so users see new content over time.
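As a rough illustration of the schedule described above, the sketch below picks a week's 50 question indices for one hop by shuffling that hop's pool with a seed derived from the hop and the cycle number. The function name `weekly_session`, the `base_seed` parameter, and the seeding scheme are assumptions made for this example, not the actual Chem-Bench implementation. Note that 50 questions × 40 weeks uses 2,000 of the 10,000 questions per hop in each cycle; this sketch reshuffles the full pool every cycle, so questions may recur across cycles, and the text does not say whether the real scheduler prevents that.

```python
import random

POOL_SIZE = 10_000     # questions per hop (from the text)
SESSION_SIZE = 50      # questions served per hop each week (from the text)
WEEKS_PER_CYCLE = 40   # weeks in one cycle (from the text)


def weekly_session(hop: int, week_index: int, base_seed: int = 0) -> list[int]:
    """Return the pool indices served for `hop` in absolute week `week_index` (0-based).

    Hypothetical scheme: one deterministic shuffle per (hop, cycle),
    sliced into consecutive 50-question weekly windows.
    """
    if hop not in (1, 2, 3):
        raise ValueError("hop must be 1, 2, or 3")
    if week_index < 0:
        raise ValueError("week_index must be non-negative")

    cycle, week_in_cycle = divmod(week_index, WEEKS_PER_CYCLE)

    # A string seed makes the shuffle reproducible across runs and machines.
    rng = random.Random(f"{base_seed}:{hop}:{cycle}")
    order = list(range(POOL_SIZE))
    rng.shuffle(order)

    start = week_in_cycle * SESSION_SIZE
    return order[start:start + SESSION_SIZE]
```

Calling `weekly_session(2, 0)` twice returns the same 50 indices, while `weekly_session(2, 40)` (the first week of the next cycle) draws from a fresh shuffle of the same pool.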
1-Hop: Single-step reasoning along one chemistry KG relation (e.g., has_role, has_functional_parent)
2-Hop: Two-step compositional reasoning across linked chemical concepts
3-Hop: Three-step chains spanning structural, functional, and role relations
10,000 per hop · 50 per hop served each week · 40-week cycle
Choose any of the 1-Hop, 2-Hop, or 3-Hop bins. Each selection runs a 50-question session drawn from that hop’s pool for the current week.
Each item is a four-option chemistry question grounded in a KG path — covering molecules, functional parents, conjugate acid/base relations, and biological/chemical roles.
After each answer we visualise the underlying k-hop KG chain that links the source concept to the correct answer.
A new 150-question set (50 per hop) unlocks every Monday. Each cycle is 40 weeks; subsequent cycles reshuffle the per-hop pools for fresh content.
Monitor your running score and per-hop performance as you work through the week’s items.
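The steps above can be sketched as a small data model: a four-option question that carries its KG path, a helper that renders the k-hop chain shown after each answer, and a per-hop score tracker. All names here (`Question`, `render_chain`, `ScoreBoard`) and the placeholder entities (`compound_X`, `parent_Y`, `role_Z`) are hypothetical and are not drawn from the actual Chem-Bench pool or code; only the relation names `has_role` and `has_functional_parent` come from the text.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Question:
    hop: int                                 # 1, 2, or 3
    prompt: str
    options: tuple[str, str, str, str]       # exactly four answer options
    answer_index: int                        # index of the correct option
    chain: tuple[tuple[str, str, str], ...]  # one (head, relation, tail) triple per hop


def render_chain(q: Question) -> str:
    """Format the KG path from the source concept to the correct answer."""
    parts = [q.chain[0][0]]
    for _head, relation, tail in q.chain:
        parts.append(f"-[{relation}]->")
        parts.append(tail)
    return " ".join(parts)


class ScoreBoard:
    """Tracks a running score and per-hop accuracy."""

    def __init__(self) -> None:
        self.correct: dict[int, int] = defaultdict(int)
        self.attempted: dict[int, int] = defaultdict(int)

    def record(self, q: Question, chosen_index: int) -> bool:
        is_correct = chosen_index == q.answer_index
        self.attempted[q.hop] += 1
        self.correct[q.hop] += int(is_correct)
        return is_correct

    def accuracy(self, hop: int) -> float:
        n = self.attempted.get(hop, 0)
        return self.correct.get(hop, 0) / n if n else 0.0

    def total_correct(self) -> int:
        return sum(self.correct.values())


# Placeholder 2-hop item (illustrative only).
example = Question(
    hop=2,
    prompt="Which role is associated with the functional parent of compound_X?",
    options=("role_Z", "role_A", "role_B", "role_C"),
    answer_index=0,
    chain=(
        ("compound_X", "has_functional_parent", "parent_Y"),
        ("parent_Y", "has_role", "role_Z"),
    ),
)

board = ScoreBoard()
board.record(example, chosen_index=0)
print(render_chain(example))
# prints: compound_X -[has_functional_parent]-> parent_Y -[has_role]-> role_Z
```

Storing the chain as ordered triples keeps the number of hops equal to `len(q.chain)`, so the same rendering helper works for 1-, 2-, and 3-hop items.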
Pick a k-hop reasoning level. Each selection runs a 50-question session drawn from this week's set.