Draft
This article is a draft.
Please do not share or link to this URL until I remove this notice
2024 Analysis of Traffic to martinfowler.com
21 March 2024
Although I do keep an eye on traffic to the site, especially with newly published articles, it's been a while since I've done a proper overview of the traffic to martinfowler.com. (The previous ones were in 2014 and 2018.) So in between my editing work, I've spent some time looking through the traffic data. Currently this page summarizes what I've understood so far, but I'll be adding more sections as I probe further.
Traffic peaked around 2020
Here's the plot of monthly page views since I started to track traffic in 2011.
We see the overall traffic flow shows a steady increase from 2011 to 2020, followed by a similarly steady decrease. Traffic levels in 2023 were about the same as 2015.
If you look carefully at the plot, you might notice there are points missing in 2019. This is due to a strange phenomenon in the data I got from Google. During a few months in 2019 the traffic of some of the navigation pages mentioned in my navigation bar (eg /agile.html) shot up from around 5000 monthly views to 221,000. It only affected a handful of pages, but a jump that big throws off monthly (and yearly) totals.
Here are the totals for each year.
year | median monthly | total for year |
---|---|---|
2011 | 306,706 | 5,334,888,024 |
2012 | 356,570 | 8,492,974,172 |
2013 | 409,300 | 9,094,607,474 |
2014 | 492,169 | 11,505,310,904 |
2015 | 594,879 | 16,060,940,907 |
2016 | 628,729 | 18,600,995,621 |
2017 | 647,936 | 24,236,927,419 |
2018 | 717,882 | 26,818,308,778 |
2019 | 801,846 | NA† |
2020 | 759,154 | 37,272,861,281 |
2021 | 711,292 | 34,080,092,653 |
2022 | 649,330 | 27,707,877,323 |
2023 | 597,662 | 19,093,317,894 |
† Due to the strange Google phenomenon above, the annual total for 2019 is meaningless.
It's interesting to see the overall traffic, but that's only the start, and probably not the most interesting information about what's happening with the site. But to delve further, we need to understand a bit more about the general shape of the website's traffic.
Traffic drops rapidly after publication
The first, important, general pattern about this site (and I suspect most websites) is the traffic pattern for any page after it's released.
The following plot shows how many views an average article on the site gets on the days following its publication.
The thing to note here is the extreme drop-off in traffic after the first few days. Not only is it an exponential curve, it's an exponential curve when plotted on a logarithmic scale.
To plot an “average article”, I'm actually looking at every article published since 2020, finding the page views for each day after publication, and taking the median for each day.
The second thing to note is that things tend to stabilize after about six months.
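As a rough illustration of the "average article" calculation described above, here is a minimal Python sketch. It is not the code used for this analysis: the function name `median_views_by_day`, the list-of-lists input shape, and the sample numbers are all assumptions made up for the example.

```python
from statistics import median

def median_views_by_day(articles):
    """articles[i][d] is the page views article i got d days after publication.

    The per-article lists may differ in length, since newer articles have
    fewer days of data. (Hypothetical input shape, for illustration only.)
    """
    longest = max(len(views) for views in articles)
    curve = []
    for day in range(longest):
        # Only articles that have been live for at least `day` days contribute.
        views_on_day = [views[day] for views in articles if len(views) > day]
        curve.append(median(views_on_day))
    return curve

# Made-up numbers, not real traffic data.
sample = [
    [5000, 1200, 300, 90],
    [8000, 2000, 450],
    [3000, 700, 200, 60],
]
curve = median_views_by_day(sample)
```

Using the median rather than the mean keeps one unusually popular article from dominating the curve.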
Traffic is lower at weekends (and Christmas)
You'll notice that I've said weekdays above. The next graph shows why.
Here I've plotted the daily traffic, and color-coded between weekdays and weekends. The graph clearly shows how weekend traffic is much lower than weekday traffic. An example consequence of this is that if I want to compare articles for their post-launch traffic, I need to stick to weekdays, otherwise there will be strange differences between articles published on Tuesdays and articles published on Thursdays.
There's another traffic variation, which isn't obvious in the graphic above, but shows up if I just plot weekday daily traffic.
Here we can see a notable drop-off at the end of year, due to the Christmas and new year holidays. This is why monthly figures for December are always lower than the rest of the year.
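Separating weekdays from weekends is simple to do once the daily counts are keyed by date. The sketch below is one possible way to do it in Python, assuming a dictionary from `datetime.date` to page views; the helper names and the sample figures are hypothetical.

```python
from datetime import date

def is_weekday(day):
    # date.weekday() returns 0 for Monday through 6 for Sunday.
    return day.weekday() < 5

def weekday_views(daily_views):
    """Keep only the Monday-to-Friday entries of a {date: views} mapping."""
    return {d: v for d, v in daily_views.items() if is_weekday(d)}

# Made-up numbers, not real traffic data.
sample = {
    date(2024, 1, 5): 21000,  # Friday
    date(2024, 1, 6): 9000,   # Saturday
    date(2024, 1, 7): 8500,   # Sunday
    date(2024, 1, 8): 22000,  # Monday
}
weekdays_only = weekday_views(sample)
```

Note that this filter does nothing about the Christmas and New Year dip, which would need a separate list of dates to exclude.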
A small amount of articles get most of the traffic
How much traffic do pages on the site get? To start answering this question, I actually have to ask what counts as a page on the site. If I look at a list of traffic counts, I'll see a lot of requested paths that don't lead to real pages, such as /articles/colelction-pipeline or /articles/cmljaGFyZH. Those requests are actually a remarkably large proportion of requested pages on the site. In January 2024, 936 of the 2506 paths requested for the site were unknown.
Fortunately, when it comes to overall traffic analysis, I don't have to worry about it that much, because the number of requests to these paths is rather small.
It turns out to be about 0.5%.
From that I think it's reasonable to concentrate on the known paths, which in January 2024, numbered some 1570 URLs.
My usual first step when looking at something like this is to plot a histogram, so I can get a sense for the frequency distribution of the data, in this case the traffic for each of these pages in the month of January.
What this tells me is that we have a really skewed distribution, skewed even though I'm plotting it here with a square root y axis. I can tell that most pages get very few views, but a few pages get a lot.
That's useful information, but I want a bit more granularity than that. I can do this by plotting the cumulative traffic, which I do by sorting the pages by how many views they get, calculating the cumulative traffic for each page, and making this plot.
Each dot on this plot is a single page. Towards the right of the plot all the dots squish together into a line. I can use this by following the 10% line from the y axis and seeing that it hits the squished points at roughly 520K page views. What this tells me is that the top 10% of pages on the site generated roughly 520K page views. By looking at the initial dots I can see that 150K page views were generated by the top 5 pages.
Seeing this in terms of page views helps us understand the graph, but once we've got the hang of it, I find it more useful to think in terms of proportion of page views.
This is exactly the same curve, but now I read that the top 10% of pages generated just over 80% of the traffic on the site.
I don't know enough about statistics to know how a real data scientist would react to a plot like this. It's very similar to a Cumulative Distribution Function.
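The sort-and-accumulate step behind these cumulative plots can be sketched in a few lines of Python. This is an illustration under assumptions, not the code behind the plots: the function name, the `{path: views}` input, and the sample values are invented, and the choice to count the busiest page as 1/n of the pages (rather than 0%) is arbitrary.

```python
from itertools import accumulate

def cumulative_traffic(views_by_path):
    """Return (proportion_of_pages, cumulative_views, proportion_of_traffic)
    rows, ordered from the busiest page to the quietest."""
    ranked = sorted(views_by_path.values(), reverse=True)
    total = sum(ranked)
    running = list(accumulate(ranked))  # running totals of page views
    n = len(ranked)
    return [((i + 1) / n, cum, cum / total) for i, cum in enumerate(running)]

# Made-up numbers, not real traffic data.
sample = {"/a": 500, "/b": 300, "/c": 100, "/d": 60, "/e": 40}
rows = cumulative_traffic(sample)
```

Each row corresponds to one dot on the plot: the second element gives the page-view version of the curve and the third gives the proportional version.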
I can also show the main points in a table.
proportion of pages | proportion of traffic | cumulative traffic |
---|---|---|
0% | 9% | 57K |
5% | 70% | 441K |
10% | 82% | 519K |
15% | 88% | 555K |
20% | 91% | 576K |
25% | 94% | 590K |
30% | 95% | 600K |
The top pages on the site
Here's a table of the top 10% of pages on the site for 2023.
rank | path | total | median | launch_date | type |
---|---|---|---|---|---|
1 | / | 512497 | 40501 | | nav |
2 | /architecture/ | 232748 | 20512 | | nav |
3 | /articles/practical-test-pyramid.html | 183933 | 15371 | 2018-02-14 | post |
4 | /articles/microservices.html | 156410 | 12889 | 2014-03-10 | post |
5 | /articles/data-mesh-principles.html | 151849 | 12356 | 2020-12-03 | post |
6 | /bliki/CQRS.html | 146743 | 12000 | 2011-07-14 | post |
7 | /refactoring/ | 144215 | 11276 | | nav |
8 | /articles/micro-frontends.html | 141922 | 11704 | 2019-06-10 | post |
9 | /articles/2023-chatgpt-xu-hao.html | 141471 | 4400 | 2023-04-13 | post |
10 | /agile.html | 131963 | 9579 | | nav |
11 | /articles/feature-toggles.html | 129180 | 10634 | 2016-01-19 | post |
12 | /articles/data-monolith-to-mesh.html | 128921 | 10584 | 2019-05-13 | post |
13 | /articles/modularizing-react-apps.html | 121523 | 5627 | 2023-02-07 | post |
14 | /aboutMe.html | 119872 | 8308 | | nav |
15 | /eaaDev/EventSourcing.html | 94263 | 7865 | 2005-12-12 | post |
16 | /articles/patterns-of-distributed-systems/ | 92676 | 6614 | 2020-08-04 | post |
17 | /articles/richardsonMaturityModel.html | 91658 | 7390 | 2010-03-18 | post |
18 | /bliki/BoundedContext.html | 90091 | 7572 | 2014-01-15 | post |
19 | /articles/mocksArentStubs.html | 85243 | 7034 | 2007-01-02 | post |
20 | /refactoring/catalog/ | 83446 | 7102 | | other |
21 | /articles/injection.html | 78452 | 6652 | 2004-01-23 | post |
22 | /bliki/StranglerFigApplication.html | 67297 | 5408 | 2004-06-29 | post |
23 | /articles/branching-patterns.html | 65644 | 5404 | 2020-04-20 | post |
24 | /bliki/DomainDrivenDesign.html | 65564 | 5504 | 2020-04-22 | post |
25 | /bliki/TwoHardThings.html | 65533 | 5400 | 2009-07-14 | post |
26 | /articles/microservice-testing/ | 65136 | 5238 | 2014-11-18 | post |
27 | /bliki/CircuitBreaker.html | 63611 | 5300 | 2014-03-06 | post |
28 | /articles/exploring-gen-ai.html | 62640 | 8031 | 2023-07-26 | post |
29 | /books/refactoring.html | 52920 | 4466 | | other |
30 | /bliki/ConwaysLaw.html | 52147 | 3512 | 2022-10-20 | post |
31 | /eaaCatalog/ | 49413 | 4056 | | other |
32 | /books/eaa.html | 47224 | 4068 | | other |
33 | /microservices/ | 45399 | 3990 | 2015-07-08 | nav |
34 | /bliki/GivenWhenThen.html | 44623 | 3784 | 2013-08-21 | post |
35 | /articles/on-pair-programming.html | 43545 | 3540 | 2020-01-15 | post |
36 | /bliki/AnemicDomainModel.html | 43148 | 3542 | 2003-11-25 | post |
37 | /bliki/DDD_Aggregate.html | 40937 | 3334 | 2013-04-23 | post |
38 | /articles/headless-component.html | 40130 | 5402 | 2023-11-01 | post |
39 | /tags/domain%20driven%20design.html | 37907 | 3901 | | other |
40 | /bliki/Yagni.html | 35378 | 2855 | 2015-05-26 | post |
41 | /articles/201701-event-driven.html | 34880 | 2864 | 2017-02-07 | post |
42 | /eaaCatalog/repository.html | 34717 | 2820 | | other |
43 | /articles/platform-teams-stuff-done.html | 33964 | 2643 | 2023-07-19 | post |
44 | /bliki/TeamTopologies.html | 33333 | 2331 | 2023-07-25 | post |
45 | /eaaCatalog/dataTransferObject.html | 32927 | 2745 | | other |
46 | /articles/building-boba.html | 31979 | 1089 | 2023-06-29 | post |
47 | /articles/continuousIntegration.html | 31961 | 2779 | 2024-01-18 | post |
48 | /articles/is-quality-worth-cost.html | 31737 | 2398 | 2019-05-29 | post |
49 | /articles/linking-modular-arch.html | 31028 | 1087 | 2023-06-13 | post |
50 | /bliki/TestPyramid.html | 30775 | 2559 | 2012-05-01 | post |
51 | /bliki/UbiquitousLanguage.html | 30379 | 2550 | 2006-10-31 | post |
52 | /books/ | 30194 | 2408 | | other |
53 | /bliki/BlueGreenDeployment.html | 29788 | 2288 | 2010-03-01 | post |
54 | /articles/patterns-of-distributed-systems/two-phase-commit.html | 29786 | 2422 | 2022-01-18 | post |
55 | /bliki/PageObject.html | 29599 | 2471 | 2013-09-10 | post |
56 | /bliki/TestDouble.html | 29245 | 2400 | 2006-01-17 | post |
57 | /bliki/MonolithFirst.html | 28956 | 2304 | 2015-06-03 | post |
58 | /articles/lmax.html | 28532 | 2418 | 2011-07-12 | post |
59 | /articles/serverless.html | 28095 | 2174 | 2016-06-15 | post |
60 | /articles/break-monolith-into-microservices.html | 27830 | 2184 | 2018-04-24 | post |
61 | /articles/scaling-architecture-conversationally.html | 27672 | 2274 | 2021-11-30 | post |
62 | /articles/ship-show-ask.html | 27287 | 2170 | 2021-09-08 | post |
63 | /bliki/CanaryRelease.html | 26455 | 2175 | 2014-06-25 | post |
64 | /eaaCatalog/unitOfWork.html | 26182 | 2181 | | other |
65 | /bliki/UnitTest.html | 25316 | 2124 | 2014-05-05 | post |
66 | /bliki/TellDontAsk.html | 24753 | 2075 | 2013-09-05 | post |
67 | /bliki/TechnicalDebtQuadrant.html | 24350 | 1890 | 2014-11-19 | post |
68 | /articles/consumerDrivenContracts.html | 24169 | 1999 | 2006-06-12 | post |
69 | /bliki/ValueObject.html | 23435 | 1919 | 2016-11-14 | post |
70 | /bliki/Slack.html | 23184 | 351 | 2023-04-04 | post |
71 | /bliki/RulesEngine.html | 21285 | 1750 | 2009-01-07 | post |
72 | /eaaCatalog/domainModel.html | 21275 | 1728 | | other |
73 | /testing/ | 20999 | 1763 | | nav |
74 | /articles/evodb.html | 20287 | 1623 | 2016-05-01 | post |
75 | /articles/cd4ml.html | 19910 | 1620 | 2019-09-03 | post |
76 | /articles/developer-effectiveness.html | 19424 | 1599 | 2021-01-05 | post |
77 | /articles/retrospective-antipatterns.html | 19422 | 398 | 2023-02-15 | post |
78 | /articles/patterns-legacy-displacement/ | 19098 | 1536 | 2021-07-20 | post |
79 | /articles/products-over-projects.html | 19092 | 1570 | 2017-11-16 | post |
80 | /articles/dependency-composition.html | 19030 | 772 | 2023-05-23 | post |
81 | /eaaDev/uiArchs.html | 18841 | 1468 | 2006-07-18 | post |
82 | /bliki/PresentationDomainDataLayering.html | 18687 | 1382 | 2015-08-26 | post |
83 | /articles/2023-chatgpt-tech-writing.html | 17167 | 888 | 2023-04-26 | post |
84 | /bliki/IntegrationTest.html | 17003 | 1416 | 2018-01-16 | post |
85 | /bliki/ContractTest.html | 16690 | 1356 | 2011-01-12 | post |
86 | /articles/bottlenecks-of-scaleups/03-product-v-engineering.html | 16321 | 1304 | 2022-10-10 | post |
87 | /dsl.html | 15741 | 1300 | | nav |
88 | /articles/creating-integrated-tech-strategy.html | 15735 | 685 | 2023-08-08 | post |
89 | /bliki/TechnicalDebt.html | 15642 | 1332 | 2019-05-21 | post |
90 | /articles/patterns-of-distributed-systems/paxos.html | 15592 | 1260 | 2022-01-05 | post |
91 | /eaaCatalog/transactionScript.html | 15465 | 1262 | | other |
92 | /articles/eurogames/ | 15158 | 1174 | 2013-10-02 | post |
93 | /eaaCatalog/serviceLayer.html | 14990 | 1249 | | other |
94 | /articles/2023-social-media.html | 14598 | 261 | 2023-11-02 | post |
95 | /articles/talk-about-platforms.html | 14568 | 1150 | 2018-03-05 | post |
96 | /refactoring/catalog/extractFunction.html | 14516 | 1147 | | other |
97 | /refactoring/catalog/replaceNestedConditionalWithGuardClauses.html | 13684 | 1202 | | other |
98 | /articles/patterns-of-distributed-systems/wal.html | 13392 | 1236 | | other |
99 | /bliki/CannotMeasureProductivity.html | 12785 | 534 | 2013-08-29 | post |
100 | /articles/agile-threat-modelling.html | 12781 | 1067 | 2020-05-18 | post |
101 | /bliki/TestDrivenDevelopment.html | 12752 | 828 | 2023-12-11 | post |
102 | /articles/bottlenecks-of-scaleups/04-costs.html | 12717 | 862 | 2023-07-31 | post |
103 | /bliki/ObjectMother.html | 12564 | 996 | 2006-10-24 | post |
104 | /articles/2021-test-shapes.html | 12511 | 1041 | 2021-06-02 | post |
105 | /bliki/CommandQuerySeparation.html | 12498 | 1010 | 2005-12-05 | post |
106 | /articles/xapo-architecture-experience.html | 12477 | 649 | 2023-07-18 | post |
107 | /articles/itsNotJustStandingUp.html | 12446 | 1072 | 2016-02-21 | post |
108 | /tags/application%20architecture.html | 12429 | 1290 | | other |
109 | /bliki/BranchByAbstraction.html | 12401 | 1014 | 2014-01-07 | post |
110 | /articles/lean-inception/ | 12291 | 1007 | 2017-03-15 | post |
111 | /bliki/CodeSmell.html | 11922 | 992 | 2006-02-09 | post |
112 | /videos.html | 11859 | 1014 | 2015-03-02 | nav |
113 | /articles/bitemporal-history.html | 11518 | 858 | 2021-04-07 | post |
114 | /articles/demo-front-end.html | 11508 | 741 | 2023-08-23 | post |
115 | /bliki/InversionOfControl.html | 11451 | 936 | 2005-06-26 | post |
116 | /delivery.html | 11279 | 898 | | nav |
117 | /eaaCatalog/dataMapper.html | 11191 | 904 | | other |
118 | /articles/microservice-trade-offs.html | 10990 | 871 | 2015-07-01 | post |
119 | /bliki/PolyglotPersistence.html | 10680 | 873 | 2011-11-16 | post |
120 | /bliki/FlagArgument.html | 10623 | 828 | 2011-06-23 | post |
121 | /articles/domain-oriented-observability.html | 10252 | 765 | 2019-04-02 | post |
122 | /bliki/ContinuousDelivery.html | 10164 | 850 | 2013-05-30 | post |
123 | /books/dsl.html | 10123 | 795 | | other |
124 | /bliki/LocalDTO.html | 9909 | 825 | 2004-10-21 | post |
125 | /books/patterns-distributed.html | 9781 | 4890 | | other |
126 | /bliki/SelfTestingCode.html | 9498 | 652 | 2014-05-01 | post |
127 | /articles/architect-elevator.html | 9442 | 746 | 2017-05-24 | post |
128 | /articles/patterns-of-distributed-systems/lamport-clock.html | 9430 | 804 | 2021-06-23 | post |
129 | /articles/gateway-pattern.html | 9428 | 766 | 2021-08-10 | post |
130 | /eaaDev/PresentationModel.html | 9214 | 770 | 2004-07-19 | post |
131 | /bliki/ParallelChange.html | 9059 | 734 | 2014-05-13 | post |
132 | /articles/nonDeterminism.html | 8970 | 732 | 2011-04-14 | post |
133 | /articles/collection-pipeline/ | 8920 | 670 | 2014-07-21 | post |
134 | /bliki/OrmHate.html | 8824 | 577 | 2012-05-08 | post |
135 | /articles/access-refactoring-web-edition.html | 8703 | 719 | 2018-11-21 | post |
136 | /bliki/TolerantReader.html | 8529 | 716 | 2011-05-09 | post |
137 | /articles/replaceThrowWithNotification.html | 8507 | 696 | 2014-12-09 | post |
138 | /bliki/ShuHaRi.html | 8475 | 714 | 2014-08-22 | post |
139 | /articles/is-tdd-dead/ | 8423 | 706 | 2014-05-19 | post |
140 | /articles/patterns-of-distributed-systems/clock-bound.html | 8397 | 385 | | other |
141 | /bliki/HumbleObject.html | 8383 | 494 | 2020-04-29 | post |
142 | /bliki/DesignStaminaHypothesis.html | 8322 | 678 | 2007-06-20 | post |
143 | /articles/agileFluency.html | 8301 | 670 | 2012-08-08 | post |
144 | /data/ | 8244 | 666 | | nav |
145 | /articles/patterns-of-distributed-systems/heartbeat.html | 8068 | 524 | 2020-08-04 | post |
146 | /bliki/FeatureBranch.html | 7966 | 670 | 2020-05-07 | post |
147 | /articles/newMethodology.html | 7814 | 636 | 2005-12-13 | post |
148 | /articles/exploring-mastodon.html | 7650 | 222 | 2022-11-01 | post |
149 | /articles/class-too-large.html | 7647 | 619 | 2020-04-14 | post |
150 | /tags/testing.html | 7606 | 645 | | nav |
151 | /bliki/FluentInterface.html | 7532 | 698 | 2005-12-20 | post |
152 | /bliki/MaturityModel.html | 7521 | 568 | 2014-08-26 | post |
153 | /bliki/ExtremeProgramming.html | 7484 | 615 | 2013-07-11 | post |
154 | /bliki/BeckDesignRules.html | 7461 | 636 | 2015-03-02 | post |
155 | /articles/web-security-basics.html | 7448 | 622 | 2016-01-28 | post |
156 | /articles/data-mesh-accelerate-workshop.html | 7435 | 295 | 2023-01-05 | post |
157 | /bliki/MicroservicePrerequisites.html | 7342 | 564 | 2014-08-28 | post |
Total is the total number of page views for that path in 2023. Median is the median number of views in a month. Launch date is the date the article was first published. Not all pages have a launch date, often because navigation pages don't really get this kind of launch, but also because some older articles don't have publication dates assigned to them (on my todo list to fix, as I examine the place of old articles). For more on the type see the section below.
Mostly the total and monthly median track each other, but the cases where they don't are interesting, and I'll want to look at those.
Different types of pages
There are a few different types of pages on the site. I can classify them into three broad groups:
- posts: these are the core content of the site: full-length articles, posts in the bliki section, infodecks, and some older pattern-oriented material.
- navigation pages (nav): pages used primarily to help the reader find posts. These are the guide pages (eg /agile.html), including the home page, and the tags pages.
- catalog pages: pages that are summaries of material that isn't on the site, such as the refactoring catalog, books pages, etc.
There are a few pages (eg faq.html) that don't quite fit into any of these; for this analysis I'm clumping them in with catalog pages.
The proportion of the overall traffic going to each type of page looks like this.
Most of the traffic to the site is to post pages, which are the ones with the real content of the site. But we can get more information by plotting the cumulative frequency curves for each of the three types.
How much does traffic vary with age
I've been working on this site for a long time, starting it in the late 1990s, not long before I joined Thoughtworks. Many of the articles thus go back a long way. When thinking about the age of pages on the site, it doesn't make much sense to think about navigation pages, so I concentrate on the posts: articles and bliki entries. Here's a plot of posts made each year.
The date information on the very oldest material is rather fuzzy. Some posts, particularly bliki posts, don't have a date in the source. I can figure these out by looking at version control, but before 2004 I was using CVS, and didn't feel motivated to go back to CVS files to figure out the real commit dates, since they didn't survive the transition to Subversion, Mercurial, and git.
Recall the earlier graph that shows how an article's traffic drops exponentially over time. Given this, we would assume that most older articles don't generate as much traffic as newer ones. Indeed I've heard anecdotally that this is commonly the case, which is why so many sites bombard us with lots of new material. I've always preferred to concentrate on material with longer lasting value. How does this work in terms of traffic?
One way to assess this is to look at articles that get a lot of traffic. In the past I've formulated “evergreen” articles as articles that have generated at least 1000 monthly pageviews for at least six months in a year. I prefer 6 months of high monthly traffic to a high yearly total, because if an article gets suddenly noticed it can generate lots of page views for just a month or two, but not sustain it.
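The evergreen rule above is easy to express as a small check. Here is a minimal Python sketch of it, assuming a list of twelve monthly page-view counts per article; the function name, parameter names, and sample figures are hypothetical.

```python
def is_evergreen(monthly_views, threshold=1000, min_months=6):
    """True if at least `min_months` of the given months reached `threshold` views."""
    busy_months = sum(1 for views in monthly_views if views >= threshold)
    return busy_months >= min_months

# Made-up numbers, not real traffic data.
# A steady article: seven months at or above 1000 views.
steady = [1200, 1100, 1300, 1250, 1000, 1400, 900, 950, 800, 1050, 990, 700]
# A briefly noticed article: a far higher yearly total, but only two busy months.
spike = [40000, 9000, 600, 300, 200, 150, 120, 100, 90, 80, 70, 60]
```

With these samples `is_evergreen(steady)` is true and `is_evergreen(spike)` is false, even though the second article has the larger yearly total, which is the distinction the six-month rule is meant to capture.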
Here's the above plot, but this time marking out these evergreen articles (in green, of course).
This shows that plenty of older articles get this kind of traffic. But what proportion of traffic goes to older articles? To get a sense of the traffic of individual articles, we can plot each post's median monthly traffic for 2023 against its publication date.
To understand what proportion of traffic goes to older articles, we can make a plot of cumulative traffic.
This traffic analysis is based on traffic for November 2023. I didn't use January 2024, because that month included publishing a rewrite of the continuous integration article, a very popular article originally written in 2006. Before the rewrite it generated a couple of thousand views per month, but when the rewrite was published it generated over thirty thousand views. I felt this would distort the picture. I then didn't use December 2023, since Decembers suffer from lower traffic due to the Christmas and New Year break.
This reads in a similar way to the earlier cumulative plots. By following the 30% proportion line, we can see that 30% of the traffic comes to posts written in the last three years.
proportion of traffic | age |
---|---|
9% | 0y 0m |
13% | 0y 0m |
17% | 0y 4m |
20% | 0y 7m |
25% | 1y 10m |
32% | 2y 11m |
35% | 3y 3m |
40% | 4y 3m |
46% | 4y 6m |
51% | 5y 9m |
55% | 7y 6m |
60% | 8y 5m |
68% | 9y 8m |
71% | 9y 10m |
75% | 10y 6m |
81% | 12y 4m |
85% | 13y 8m |
91% | 17y 1m |
96% | 19y 5m |
100% | 23y 4m |
The rows here are roughly every 5%, but don't hit exactly the 5s because a single article can generate more than one percentage point of traffic, and thus cross the lines (something that the graph reveals). But it still tells us that 51% of post traffic comes to articles 5 years 9 months or younger.
This makes it clear how much the old articles are still read today. Posts that are over five years old get about half the traffic that comes to posts on the site. Although that's the main thing to note here, I should not forget that younger posts do grab a big hunk: 13% of traffic came from brand new articles.
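A cumulative-traffic-by-age table like the one above could be derived along the following lines. This Python sketch is only an illustration under assumptions: the input is taken to be a list of (publication date, monthly views) pairs, age is measured in whole calendar months, and the names and sample numbers are invented.

```python
from datetime import date

def traffic_by_age(posts, as_of):
    """posts: list of (publication_date, monthly_views) pairs.

    Returns (age_in_months, cumulative_share_of_traffic) rows,
    ordered from the youngest post to the oldest.
    """
    total = sum(views for _, views in posts)
    rows, running = [], 0
    for published, views in sorted(posts, key=lambda p: p[0], reverse=True):
        running += views
        # Age in whole calendar months between publication and `as_of`.
        age = (as_of.year - published.year) * 12 + (as_of.month - published.month)
        rows.append((age, running / total))
    return rows

# Made-up numbers, not real traffic data.
sample = [
    (date(2023, 11, 2), 300),
    (date(2020, 12, 3), 200),
    (date(2014, 3, 10), 400),
    (date(2004, 1, 23), 100),
]
rows = traffic_by_age(sample, as_of=date(2023, 11, 30))
```

In this sample a single post moves the cumulative share from 50% to 90%, which is the same effect that stops the rows in the table from landing exactly on multiples of 5%.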
Significant Revisions
21 March 2024: published up to “A small amount of articles get most of the traffic”
01 February 2024: started drafting