You ran a camera on a ridge for three weeks and never photographed a lynx. Is there no lynx, or did you just miss it?
That question is the whole ballgame, and most analyses get it wrong by assuming the answer is obvious. It isn't. A blank detection record at an occupied site is not a bug; it's the expected outcome of a simple fact: your camera doesn't catch everything that walks past, and an animal that roams a home range far larger than your camera's field of view will spend most of its time somewhere your camera isn't looking. Treat that blank as a true absence and you've baked an error into everything downstream: your distribution map, your habitat model, your "the species is gone from this drainage" conclusion. The polite, precise name for this kind of record isn't "absence." Gary White's occupancy notes put it bluntly: these are presence/apparent-absence data, and "not detecting a species does not imply absence".
Occupancy modeling is the framework built to take that seriously. Instead of counting animals, it asks a more answerable question — is this site occupied or not? — and it estimates two things at once: ψ (psi), the probability a site is occupied, and p, the probability you detect the species on a given survey given that it's there. The trick that makes it work is almost embarrassingly simple: survey each site more than once. If you detect the species even once, you know that site was occupied the whole time, which means every blank survey at that site was a miss, not an absence — and the rate of those misses is exactly the information you need to estimate p, and then correct ψ for all the sites where you saw nothing at all.
This is a method piece for people who'll actually run it: researchers, game and wildlife managers, and the serious end of citizen science. We'll build up why naive occupancy is biased, how the single-season and dynamic models work, what the assumptions really demand of you, where camera traps specifically trip the model up, and which software to reach for. It's a close cousin of two other questions — how to place your cameras in the first place and how to turn detections into animal density — but it's a distinct one, and we'll keep it that way. If you want the array-design layer, see Camera Trap Survey Design: Spacing, Density, and Duration; if you want a number with animals-per-km² on it, see How to Estimate Wildlife Density From Camera Traps Without Marking Animals.
Why "naive occupancy" is the wrong starting line
Start with the temptation everyone feels. You've got a grid of camera sites and a record for each: did the target species show up or not? The obvious move is to call the proportion where it showed up your occupancy estimate. That's naive occupancy — and it's biased low, every time, by an amount you can't see and can't bound without more information.
Here's the mechanism, with no math yet. Suppose a species truly occupies 60 of your 100 sites. At the 60 occupied sites your camera detects it at, say, 40 of them and misses it at the other 20. Your naive estimate is 40/100 = 0.40. The truth was 0.60. You didn't measure occupancy; you measured occupancy times detectability, tangled together, and reported the product as if it were the thing you cared about. The lower your detection probability, the worse the gap — and detection probabilities are low. In a quantitative review of 537 ecology papers, Kellner and Swihart found that among studies that actually reported it, 70% had per-survey detection estimates below 0.5. Most surveys, most of the time, miss the species more often than they catch it.
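The arithmetic above can be sketched in a few lines. This is a minimal illustration, not anyone's published code: the function name and the example detection probabilities are assumptions chosen so that, with p = 0.2 and five surveys, roughly two-thirds of occupied sites register the species, which reproduces the 40-out-of-100 example.

```python
def expected_naive_occupancy(psi: float, p: float, n_surveys: int) -> float:
    """Expected share of sites with at least one detection."""
    # Chance an occupied site yields at least one detection in n_surveys tries.
    p_star = 1.0 - (1.0 - p) ** n_surveys
    # Unoccupied sites never register (no false positives assumed here).
    return psi * p_star


psi_true = 0.60  # the example above: 60 of 100 sites truly occupied
for p, n_surveys in [(0.2, 5), (0.2, 10), (0.5, 5)]:  # illustrative values
    naive = expected_naive_occupancy(psi_true, p, n_surveys)
    print(f"p={p}, surveys={n_surveys}: naive ~ {naive:.3f} (truth {psi_true})")
```

The naive figure only approaches the truth as the cumulative detection probability approaches 1, which is why low per-survey detection matters so much.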
It gets worse the moment you do anything interesting with the data. The classic demonstration is Gu and Swihart's "Absent or undetected?", which showed that when your probability of detecting a species is itself correlated with a habitat variable — and it usually is; animals are easier to photograph in some cover types than others — that correlation leaks into your estimated habitat coefficients and biases them, in either direction depending on the signs involved. In their small-mammal trapping illustration, non-detection error ran from 0% to 23% across seven species after five days of sampling. So the model that tells you "this species loves dense brush" might be telling you nothing more than "this species is easier to catch in dense brush." Same data, opposite meaning.
A naive occupancy estimate isn't occupancy — it's occupancy times detectability, reported as if the second number were 1.
That's the case for doing the extra work. Now the work itself.
The single-season model, built from one site
Everything starts with the static, single-season model: one snapshot in time, during which we assume a site's occupancy state doesn't change. Pick your sites, survey each one several times within a short enough window that nothing's moving in or out, and write down what you saw as a detection history — a string of 1s (detected) and 0s (not detected), one digit per survey.
The model has two moving parts. A site is occupied with probability ψ, or not, with probability 1−ψ. If it's occupied, you detect the species on each survey with probability p, or miss it with probability 1−p. And — this is the load-bearing assumption we'll come back to — if the site is unoccupied, you can't detect the species at all. From those pieces you can write down the probability of any detection history. The MARK textbook's worked version is the clearest I know. For the history "01" (missed on survey one, detected on survey two), the probability is:
ψ (1 − p₁) p₂ — the site is occupied, you missed it the first time, you caught it the second.
The interesting one is the all-zeros history, "00." You saw nothing. Does that mean the site is empty? The model refuses to assume so. Its probability is:
ψ (1 − p₁)(1 − p₂) + (1 − ψ).
Read that out loud, because it's the entire philosophy in one line: either the site was occupied and you missed the species on both surveys (the left term), or the site truly wasn't occupied (the right term). The "+" is a logical OR. A blank record is consistent with both stories, and the model keeps both alive, weighting them by how often you detect the species at sites where you did see it. That's where the correction comes from. Run this likelihood across all your sites at once and you get a ψ that accounts for the animals you didn't photograph.
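To make the likelihood concrete, here is a small sketch that computes the probability of a detection history and then finds ψ and p by brute-force grid search. It assumes a single constant p across surveys (the formulas above allow p₁ and p₂ to differ), the data are made up, and the function names are hypothetical. Real software uses a proper optimizer rather than a grid.

```python
import math
from collections import Counter


def history_probability(history: str, psi: float, p: float) -> float:
    """Probability of one detection history, e.g. "010", with constant p."""
    occupied_path = psi
    for y in history:
        occupied_path *= p if y == "1" else (1.0 - p)
    if "1" in history:
        return occupied_path  # any detection rules out "unoccupied"
    return occupied_path + (1.0 - psi)  # occupied-but-missed OR truly empty


def log_likelihood(counts: Counter, psi: float, p: float) -> float:
    """Sum of log-probabilities over sites, grouped by identical history."""
    return sum(n * math.log(history_probability(h, psi, p))
               for h, n in counts.items())


def fit_by_grid(histories: list[str], step: float = 0.01) -> tuple[float, float]:
    """Crude maximum-likelihood fit: try every (psi, p) pair on a grid."""
    counts = Counter(histories)
    grid = [i * step for i in range(1, round(1 / step))]
    return max(((psi, p) for psi in grid for p in grid),
               key=lambda pair: log_likelihood(counts, pair[0], pair[1]))


# Made-up data: 100 sites, 3 surveys each, 40 sites with at least one detection.
histories = ["010"] * 25 + ["110"] * 10 + ["111"] * 5 + ["000"] * 60
naive = sum("1" in h for h in histories) / len(histories)
psi_hat, p_hat = fit_by_grid(histories)
print(f"naive = {naive:.2f}, psi_hat = {psi_hat:.2f}, p_hat = {p_hat:.2f}")
```

On this toy dataset the fitted ψ comes out above the naive 0.40, because the model assigns some of the 60 blank sites to the "occupied but missed" story.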
Compare that to the old way — logistic regression of presence on habitat — which "assumes a species was absent at sites where it was not detected, and thus inference is known to be biased when the probability of detecting a species at a site is less than one". Occupancy modeling is, at heart, that one fix applied honestly.
In R, this is the `occu()` function in the unmarked package: you build an `unmarkedFrameOccu` object from your detection matrix and fit it with a double formula — one part for detection, one for occupancy. We'll come back to the software. First, the data those models eat, because for camera traps that's where the real decisions hide.

Turning a camera roll into a detection history
A bird survey comes pre-packaged into occasions: you visited the wetland Tuesday, Thursday, and Saturday — three surveys. A camera trap doesn't work that way. It runs continuously for weeks and hands you a pile of time-stamped photos. Before any occupancy model can touch that, you have to chop the deployment into discrete occasions yourself, and how you chop it changes your answer.
The tool most people use is camtrapR's `detectionHistory()`, which takes your records and an `occasionLength` and produces the detection/non-detection matrix that unmarked or its Bayesian cousins expect — and, importantly, lets you carry the camera's active effort along, so a station that ran ten days isn't treated like one that ran thirty. The judgment call is the occasion length, and Sollmann's primer lays out the trade-off cleanly:
- Too short (say, daily occasions) and your matrix fills with zeros. A wall of 0s drives the detection estimate toward zero and can break the model numerically.
- Too long and you throw away information — fifteen separate photographs of a fox at one site collapse into a single "1," and you've discarded everything the repeat visits were supposed to tell you about p.
- Keep it constant. Detection probability rises with occasion length, so if your occasions (or the effort within them) differ, you have to put occasion length or effort in as a covariate on detection or accept a biased p.
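The chopping step in the list above can be sketched in plain Python. This is not camtrapR's implementation, only a hypothetical helper that shows the logic: it assumes one continuous deployment with no camera downtime, and returns the effort (active days) per occasion alongside the 0/1 history.

```python
from datetime import date


def detection_history(setup: date, retrieval: date,
                      detection_dates: list[date],
                      occasion_length: int) -> tuple[list[int], list[int]]:
    """Return (history, effort), one entry per occasion of occasion_length days."""
    n_days = (retrieval - setup).days + 1            # both end dates inclusive
    n_occasions = -(-n_days // occasion_length)      # ceiling division
    history = [0] * n_occasions
    # The last occasion may be shorter than the rest; record its true effort.
    effort = [min(occasion_length, n_days - j * occasion_length)
              for j in range(n_occasions)]
    for d in detection_dates:
        offset = (d - setup).days
        if 0 <= offset < n_days:                     # skip records outside deployment
            history[offset // occasion_length] = 1   # many photos collapse to one 1
    return history, effort


setup, retrieval = date(2024, 6, 1), date(2024, 6, 24)   # 24 active days
photos = [date(2024, 6, 2), date(2024, 6, 3), date(2024, 6, 4), date(2024, 6, 20)]
for length in (1, 7, 12):
    history, effort = detection_history(setup, retrieval, photos, length)
    print(length, history, effort)
```

Running it with daily, weekly, and 12-day occasions shows the trade-off directly: the daily history is mostly zeros, while the longest window turns four photos into two 1s. The short final occasion in the weekly case is exactly why effort has to travel with the matrix.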
There's a subtler camera-specific trap that a 2024 paper by Goldstein and colleagues put a number on: temporal autocorrelation. An animal photographed at a camera today is more likely to be photographed there tomorrow — it's using that spot. Standard occupancy models assume your repeat detections are independent, and when they're not, the model reads the clustered detections as a higher detection rate than is real, then concludes that sites with no detections are probably empty — biasing occupancy downward. Their verdict is sobering: autocorrelation is "likely widespread in camera trap data and many previous studies of occupancy based on camera trap data may have systematically underestimated occupancy probabilities". Their fixes are practical and worth memorizing: test for it with a join-count goodness-of-fit test; if it's there, use larger detection windows to soak up the clustering; and do not try to fix it by leaving gaps between occasions — gaps don't remove the temporal structure and just throw away data.
Cut your camera deployment into occasions too finely and the model drowns in zeros; too coarsely and you throw away the repeat detections that make it work.
Detection probability is never constant — so model it

The naive picture assumes one detection probability for everything. The reality, from the same 537-paper review, is that when researchers bothered to test whether detection was constant, 86% of the time it significantly varied. Weather, season, the camera's position on or off a trail, the observer, the moon — all of it moves p around. The good news is the framework was built for this: you can model occupancy ψ as a function of site covariates (habitat, elevation, distance to a road) and **detection p as a function of site and survey covariates** (effort, temperature, season).
One quirk worth internalizing: because occupancy is assumed fixed within a season, the covariates you put on ψ should be things that don't change within that season — site-level, time-constant. Detection p, by contrast, can happily take time-varying covariates, because you measure it survey by survey.
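In practice "modeling p with covariates" usually means a logit link: a linear combination of covariates is passed through the inverse-logit so the result stays between 0 and 1. The sketch below shows only that mechanic. The covariates (effort in days, on-trail placement) and every coefficient value are invented for illustration, not estimates from any study.

```python
import math


def inv_logit(x: float) -> float:
    """Map any real number onto the 0-1 probability scale."""
    return 1.0 / (1.0 + math.exp(-x))


def detection_probability(effort_days: float, on_trail: bool,
                          beta0: float = -1.5,        # hypothetical intercept
                          beta_effort: float = 0.15,  # hypothetical effect per active day
                          beta_trail: float = 0.8     # hypothetical on-trail effect
                          ) -> float:
    """Per-occasion detection probability for one site and one occasion."""
    linear = (beta0
              + beta_effort * effort_days
              + beta_trail * (1.0 if on_trail else 0.0))
    return inv_logit(linear)


for effort_days, on_trail in [(3, False), (7, False), (7, True)]:
    p = detection_probability(effort_days, on_trail)
    print(f"effort={effort_days} days, on_trail={on_trail}: p = {p:.2f}")
```

The same link is applied to ψ, with the difference noted above: ψ takes only site-level covariates, while p can take values that change from one occasion to the next.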
Then there's a kind of detection heterogeneity you usually can't see and can't put in a covariate: variation driven by how many animals are actually at a site. A site with eight resident foxes will photograph "a fox" far more readily than a site with one passing through. Royle and Nichols turned that nuisance into a model. Their key insight is that variation in abundance induces variation in detection probability, through a clean relationship — the chance of detecting the species at a site is one minus the chance of missing every individual there:
pᵢ = 1 − (1 − r)^Nᵢ, where r is the per-individual detection probability and Nᵢ is the local abundance at site i.
More animals, higher site-level detectability. Run it the other way and you can back out a crude estimate of abundance from nothing but repeated detection/non-detection data, without ever marking an individual — that's the Royle–Nichols model. The deeper lesson, in Royle and Nichols' own words, is that ignoring this kind of heterogeneity is "a de facto assumption of constant abundance among sites," which is rarely what you mean. Royle's later work generalized the point: when detection varies among occupied sites and you leave it unmodeled, your occupancy estimate is biased low. Detection heterogeneity isn't a footnote. It's the second-biggest way these models go wrong, after pretending detection is perfect at all.
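The Royle–Nichols relationship is easy to tabulate. A minimal sketch, with r = 0.1 chosen purely for illustration and a hypothetical function name:

```python
def site_detection_probability(r: float, n_animals: int) -> float:
    """Chance of detecting the species at a site holding n_animals individuals."""
    # Miss the species only if every individual is missed.
    return 1.0 - (1.0 - r) ** n_animals


r = 0.1  # illustrative per-individual detection probability
for n_animals in (1, 2, 4, 8):
    p_site = site_detection_probability(r, n_animals)
    print(f"N={n_animals}: site-level p = {p_site:.3f}")
```

With the same per-individual r, a site with eight animals is several times more detectable than a site with one, which is the heterogeneity a constant-p model silently ignores.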
When occupancy changes: the dynamic model
A single-season model is a photograph. Sometimes you need the movie — is the species spreading into new country, or blinking out of places it used to be? That's the dynamic, multi-season model, MacKenzie and colleagues' 2003 extension and one branch of a model family that has since grown to cover multiple states, communities, and false positives. It adds two parameters that are the whole point:
- Colonization (γ, gamma): the probability a site that was unoccupied in one season becomes occupied in the next.
- Local extinction (ε, epsilon): the probability a site that was occupied becomes unoccupied.
The structure mirrors Pollock's robust design: you have primary periods (the "seasons," between which occupancy can change) and within each, secondary surveys (your repeat visits, during which it can't). A warning about the word "season," because it trips people up: in occupancy modeling it does not mean winter or summer. As one carnivore study states flatly, "in the occupancy model the word 'season' does not necessarily mean geographic season… it just means the period in which data were collected". Your seasons might be three consecutive years, or summer-and-winter repeated over a decade. The model doesn't care what you call them; it cares that occupancy is closed within each and open between them.
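The bookkeeping behind colonization and extinction is a one-line recursion: next season's occupancy is the occupied sites that persist plus the empty sites that get colonized. The sketch below projects that forward, assuming γ and ε stay constant across seasons. The starting values are illustrative and the function names are hypothetical.

```python
def project_occupancy(psi_first: float, gamma: float, epsilon: float,
                      n_seasons: int) -> list[float]:
    """Occupancy in each season, given constant colonization and extinction."""
    trajectory = [psi_first]
    for _ in range(n_seasons - 1):
        psi = trajectory[-1]
        # Occupied sites that persist + unoccupied sites that are colonized.
        trajectory.append(psi * (1.0 - epsilon) + (1.0 - psi) * gamma)
    return trajectory


def equilibrium_occupancy(gamma: float, epsilon: float) -> float:
    """Long-run occupancy the recursion settles toward."""
    return gamma / (gamma + epsilon)


trajectory = project_occupancy(psi_first=0.5, gamma=0.1, epsilon=0.2, n_seasons=6)
print([round(psi, 3) for psi in trajectory])
print(round(equilibrium_occupancy(0.1, 0.2), 3))
```

A dynamic model estimates γ and ε from the detection histories rather than assuming them, but this is the relationship it uses to link one primary period to the next.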
Two camera-trap studies show the model earning its keep. In a three-year survey of North Chinese leopard, leopard cat, and red fox in a Chinese nature reserve, the team used a dynamic model to ask whether each carnivore's occupancy was stable or shifting, and tied colonization and extirpation to elevation and human disturbance — finding occupancy fairly stable, with elevation mattering more than distance to villages or roads. They were candid about the central tension, too: dynamic models assume sites are closed during surveys but "may be 'open' between seasons," and that openness "introduces heterogeneity, which… causes estimate bias" that covariates can only partly absorb.
The more instructive example is an eleven-year, eight-species camera-trap dataset from northwestern Anatolia, Turkey. The authors structured it as 11 summer and 11 winter seasons (22 primary periods) and modeled how each species' use of a site shifted with season, elevation, and human density, fitting it all in a Bayesian framework with JAGS and checking fit with a Bayesian p-value. Their finding was a clean argument for going dynamic in the first place: every species showed seasonality in habitat use, but the strength of it differed by species, so a single annual snapshot would have averaged away the very pattern they were after. If your animals shuffle around within the year, and most do, a static model quietly lies to you about a moving target.
"Season" in an occupancy model isn't winter or summer — it's just the window you decide occupancy holds still.
The assumptions, and what they cost you

This is where good practitioners separate from cargo-culters. Occupancy models rest on a short list of assumptions, and the discipline is knowing which one you're bending and what it does to your answer. The MARK chapter lists five; the book-length treatment, if you want every nuance and the proofs behind them, is MacKenzie and colleagues' Occupancy Estimation and Modeling.
| Assumption | What it means | What goes wrong if you break it |
|---|---|---|
| Closure | A site's occupied/unoccupied status doesn't change during a season's surveys | Estimates are biased high; or, if movement is random, you're estimating use, not occupancy |
| Modeled occupancy variation | Occupancy is constant across sites, or you've modeled the differences with covariates | Estimates represent an average; precision overstated |
| Modeled detection variation | Detection is constant across occupied sites, or modeled with covariates | Unmodeled heterogeneity biases occupancy low |
| Independence | Detection histories at different sites are independent | Occupancy roughly unbiased, but confidence intervals too narrow |
| No false positives | The species is never recorded where it's actually absent | Occupancy biased high; colonization and extinction distorted |
Three of these deserve a closer look for anyone working with cameras.
Closure, and the honest retreat to "use." A camera watches a few square meters of trail; a wolf's home range is hundreds of square kilometers. The animal is constantly moving in and out of your camera's view, which strictly violates closure. The field's mature, honest response is to reinterpret the parameter: when a wide-ranging species is randomly available like this, your occupancy estimates are still unbiased, but ψ should be read as the probability the site is "used" by the species, not strictly occupied. The Turkish study did exactly this, and said so plainly — because a camera site "may cover only a small portion of the home range of an individual," their analysis "estimates probability of use of a site, rather than of occupancy," so they wrote "use" throughout and renamed local extinction "desertion". This isn't a weakness to hide; it's the correct interpretation, and stating it is the mark of someone who understands the model.
Independence, and the camera-array trap. Here's one that bites camera people specifically. The MARK authors give the exact scenario: "if remote-cameras are located near each other, a single individual may be detected at multiple cameras (sites) during a given week (survey)". When that happens, your sites aren't independent, because the same animal is generating "detections" at several of them, so the number of cameras overstates the number of independent units in your study. The damage is sneaky: your occupancy estimate often stays roughly right, but your uncertainty is understated, with confidence intervals too narrow and a false sense of precision. The fix lives at the design stage: space cameras far enough apart for independence. (That is a different question from this article's, and a good one; see Camera Trap Survey Design: Spacing, Density, and Duration.)
No false positives, and why AI raises the stakes. Every model above assumes that if you recorded the species, it was really there. Standard occupancy models bend over backward for false negatives — missing a present animal — but assume false positives never happen. They do. Miller and colleagues built the founding model for handling both error types after controlled experiments showed that even highly trained observers misidentify species as present when they're absent. Their warning is sharp: "bias in estimators of occupancy, colonization, and extinction can be severe when false positives occur," and unmodeled misidentification overestimates occupancy. This matters more now than when it was written, because automated species classifiers make exactly this error — confidently labeling a coyote as a wolf, a bobcat as a lynx. A misclassified detection is a false positive in the technical sense, and if you feed an unverified, auto-labeled detection history straight into an occupancy model, you may be modeling the classifier's mistakes as range expansion. The defensible workflow Miller's framework points to is to keep a subset of detections that are verified — a human-confirmed type for which false positives can be assumed absent — and let the model lean on those.
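A rough sense of the damage comes from a simple two-error calculation. This is not Miller and colleagues' full model, only the basic mixture idea behind it: occupied sites produce true detections with probability p11 per survey, and unoccupied sites produce false ones with probability p10. All numbers and names below are illustrative assumptions.

```python
def share_of_sites_with_a_record(psi: float, p11: float, p10: float,
                                 n_surveys: int) -> float:
    """Expected share of sites with at least one recorded 'detection'."""
    # Occupied sites: at least one true detection across the surveys.
    occupied = 1.0 - (1.0 - p11) ** n_surveys
    # Unoccupied sites: at least one misidentification across the surveys.
    unoccupied = 1.0 - (1.0 - p10) ** n_surveys
    return psi * occupied + (1.0 - psi) * unoccupied


psi_true = 0.30  # illustrative true occupancy
for p10 in (0.0, 0.02, 0.05):  # illustrative per-survey false-positive rates
    share = share_of_sites_with_a_record(psi_true, p11=0.4, p10=p10, n_surveys=5)
    print(f"p10={p10}: share of sites with a record = {share:.3f} (truth {psi_true})")
```

With no false positives the share of sites with a record sits below the truth, the familiar naive bias. A per-survey misidentification rate of a few percent is enough to push it above the truth, and a standard model that assumes every record is real has no way to pull it back down.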
For wide-ranging animals on cameras, you're rarely estimating strict occupancy — you're estimating site use, and the honest papers say so.
More than one species at a time

If you've got a camera grid, you almost never have one species — you have a community, most of them photographed too rarely to model one at a time. Multi-species (community) occupancy models pool information across species so the data-rich ones help estimate the data-poor ones, while still correcting each for its own detection probability. Rich and colleagues' study of a mammal community in northern Botswana is a good flagship: 44 species over 6,607 trap nights, a mean species occurrence probability of 0.32, and a map of which habitats and protected areas held the most diversity — all while accounting for the fact that different species are photographed at wildly different rates.
The honest caveat comes from Guillera-Arroita and colleagues: community models can estimate species richness, including a guess at species never detected at all, but that guess about the undetected rests heavily on model structure and assumptions rather than on data you actually collected — so interpret community-level richness numbers with appropriate humility. The model is powerful; it is not a crystal ball for species your cameras never saw.
How many visits before "absent" means something?
This is the question managers actually ask: I surveyed and found nothing — now can I say the species isn't here? The occupancy answer is refreshingly concrete: you can never say "absent" with certainty, but you can make the probability of having missed a present species as small as you like, by stacking up detection chances. If your per-survey detection is p, the probability you'd miss an occupied site across k independent surveys is (1−p)^k — small p just means you need more k. The survey-design literature works this the practical way around: decide how confident you need to be, then allocate enough visits and per-visit effort to push your cumulative detection probability up to it. For cameras, "more visits" usually means "leave it out longer," which is the cheapest knob you have. And you don't always need a purpose-built survey: Wevers and colleagues built distribution models for wild boar and roe deer in the Swiss Jura out of incidental camera-trap "by-catch" — detections that piled up during winter wildlife surveys aimed at something else — by first modeling detectability and then correcting for it. The detections you already have can often answer an occupancy question, as long as you account for how imperfectly you collected them.
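The (1−p)^k rule turns directly into a planning calculation. A minimal sketch, assuming independent surveys with the same p on each, and using a hypothetical function name:

```python
def surveys_needed(p: float, confidence: float = 0.95) -> int:
    """Smallest k such that an occupied site is detected with the given confidence."""
    if not 0.0 < p < 1.0:
        raise ValueError("p must be strictly between 0 and 1")
    k = 1
    # Keep adding surveys until the chance of missing every time is small enough.
    while (1.0 - p) ** k > 1.0 - confidence:
        k += 1
    return k


for p in (0.1, 0.2, 0.5):
    print(f"p={p}: {surveys_needed(p)} surveys for 95% cumulative detection")
```

For a camera, read "surveys" as occasions: at a per-occasion p of 0.2, it takes 14 occasions before a blank record leaves less than a 5% chance that an occupied site was missed.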
And when the stakes are high — a threatened species, a critical-habitat decision — the conservative posture is the one Bennett and colleagues argue for: because imperfect detection is the norm and we lack good detectability estimates for most priority species, "absence in surveys should be considered suggestive only," and critical habitat should be managed as occupied even when presence isn't confirmed. A non-detection is evidence, not proof, and how strong that evidence is depends entirely on how hard you looked.
You can never prove a species is absent — only make the odds of having missed it as small as you're willing to work for.
The software, briefly

You don't need to hand-code a likelihood. The ecosystem is mature, and which tool you pick mostly depends on how much you want to code and how big or complex your problem is.
| Tool | What it is | Reach for it when… |
|---|---|---|
| PRESENCE | The original standalone occupancy program (USGS, with MacKenzie); estimates patch occupancy and the dynamic/false-positive extensions. Has a point-and-click interface and an R bridge, RPresence | You'd rather not live in R, or want the program the methods papers were built around |
| unmarked (R) | The workhorse free R package; maximum-likelihood fitting of occupancy, Royle–Nichols, repeated-count, and distance models through one interface | You're in R and want fast, well-trodden single-species and dynamic models |
| camtrapR (R) | Not a model-fitter — the bridge. `detectionHistory()` turns time-stamped photos into the detection matrices the model packages need, effort included | You have raw camera records and need clean occupancy input |
| ubms (R) | unmarked's Bayesian twin — nearly the same interface, but fits in Stan, so you get random effects and full posterior inference | You need random effects or a Bayesian treatment but like unmarked's syntax |
| spOccupancy (R) | Modern Bayesian package for spatially explicit, multi-species, and data-integrated occupancy; scales to tens of thousands of sites via nearest-neighbor Gaussian processes | Your problem is big, spatial, multi-species, or fuses several datasets |
A typical camera-trap pipeline runs camtrapR → unmarked (or ubms / spOccupancy), with PRESENCE as the no-code alternative. If you're learning the framework from scratch, the USGS `occupancyTuts` package is a genuinely good on-ramp: 28 interactive tutorials covering single-season, dynamic, study design, and the spin-off models, with videos and exercises.
One soft note on workflow, because it's the bottleneck before any of this software runs: occupancy models need clean detection histories, and the slow part is almost always sorting thousands of photos to find the handful with your target species in them — and not mislabeling them, which, as we saw, is a false positive the model will take at face value.

Occupancy is a real quantity — don't quietly turn it into abundance
A closing discipline, because it's the most common abuse. Occupancy is a legitimate, useful quantity in its own right: the proportion of sites used, how it shifts with habitat, how it changes across years. What it is not is a stand-in for abundance. Sollmann is explicit: "occupancy should not be misinterpreted as an index of abundance," because the relationship between the two depends on a species' range size and behavior. A site can read "occupied" whether one animal or fifty use it. If the number you actually need is animals per square kilometer, occupancy is the wrong model and you want the density family instead; see How to Estimate Wildlife Density From Camera Traps Without Marking Animals. Use occupancy for what it's brilliant at, which is where a species is and where it's going, and let it stop there.
Frequently asked questions
What is an occupancy model in plain terms?
It's a way to estimate the probability that a species is present at a site while accounting for the fact that you don't always detect it when it is. You survey each site repeatedly and the model separates "where the species actually is" (occupancy, ψ) from "how likely you are to see it when it's there" (detection probability, p).
Why isn't a non-detection the same as an absence?
Because detection is imperfect — your camera or survey misses animals that are present, especially when they're rare or roam a large home range. A blank record is consistent with both "not there" and "there but missed," so treating it as a true absence systematically underestimates how widespread a species is. Across hundreds of studies, most per-survey detection probabilities were below 0.5.
How many times do I need to survey a site before I can call it empty?
There's no number that proves absence, but you can drive the chance of a false "absent" as low as you need by adding surveys or effort: across k independent visits, the probability of missing an occupied site is (1−p)^k, so low detection just means more visits. For high-stakes decisions, the conservative stance is to treat a non-detection as suggestive only.
What's the difference between a single-season and a dynamic occupancy model?
A single-season (static) model assumes occupancy doesn't change during the survey period — a snapshot of where the species is. A dynamic (multi-season) model lets occupancy change between seasons and estimates the rates of that change: colonization (sites becoming occupied) and local extinction (sites becoming unoccupied).
Do occupancy models really work for wide-ranging animals on camera traps?
Yes, but with a reinterpretation. Because a camera covers only a sliver of a large home range, the animal moves in and out of view and strict "closure" is violated; the standard, honest fix is to read your estimate as the probability the site is used by the species rather than strictly occupied — the estimate stays valid, the label changes.
Can AI species recognition cause problems in occupancy analysis?
It can, if you don't verify it. A classifier that mislabels one species as another creates a false-positive detection — a species "recorded" where it's absent — and unmodeled false positives bias occupancy estimates upward and distort colonization and extinction. The fix is to verify detections (or keep a confirmed subset the model can trust) before building detection histories.