vipulnaik

One particularly amusing bug I was involved with was in an early version of the content recommendation engine at the company I worked at (websites use this to recommend related content on the site, such as related videos and articles). One of the customers for the recommendation engine was a music video service, and we/they noticed that One Direction's song Infinity was showing up at the top of our recommendations a little too often. (I think this was triggered by the release of another One Direction song bringing Infinity into circulation, but I don't remember what that other song was.)

It turned out this was due to a bug where we were taking a dot product of feature values with feature weights, where the feature value was being cast from a string to a numeric, with a fallback to zero if it was non-numeric, and then multiplied by the feature weight. For the "song title" feature, the feature weight was zero, and the feature value was anyway non-numeric, but even if it were numeric, it shouldn't matter, because anything times zero is ... zero, right? But the programming language in question treated "Infinity" as a numeric value, and it defined Infinity * 0 to be NaN (not a number) [ETA: A colleague who was part of the discovery process highlights that this behavior is in fact part of the IEEE 754 standard, so it would hold even for other programming languages that were compliant with the standard]. And NaN + anything would still be NaN, so the dot product would be NaN. And the way the sorting worked, NaN would always rank on top, so whenever the song got considered for recommendation it would rank on top.
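A minimal sketch of how this failure mode can arise, written in Python (not necessarily the language in question). The feature names, weights, and song data are made up for illustration, and `to_number`, `to_number_safe`, and `score` are hypothetical helpers, not the engine's actual code. Python's `float()` also accepts the string "Infinity", so the same chain plays out: the title parses to infinity, infinity times a zero weight gives NaN, and NaN poisons the sum.

```python
import math

# Hypothetical weights: the title is not supposed to matter at all.
WEIGHTS = {"title": 0.0, "play_count": 0.5}


def to_number(value: str) -> float:
    """Cast a string feature value to a float, falling back to 0.0 if non-numeric."""
    try:
        return float(value)  # float("Infinity") parses to inf
    except ValueError:
        return 0.0


def to_number_safe(value: str) -> float:
    """Same cast, but also treat inf and NaN as non-numeric."""
    number = to_number(value)
    return number if math.isfinite(number) else 0.0


def score(features: dict, weights: dict, cast=to_number) -> float:
    """Dot product of cast feature values with their weights."""
    return sum(cast(v) * weights.get(name, 0.0) for name, v in features.items())


# Made-up feature rows for two songs.
infinity_song = {"title": "Infinity", "play_count": "10"}
other_song = {"title": "Some Other Song", "play_count": "10"}

buggy = score(infinity_song, WEIGHTS)                        # inf * 0.0 is NaN, NaN + 5.0 is NaN
fixed = score(infinity_song, WEIGHTS, cast=to_number_safe)   # title falls back to 0.0
normal = score(other_song, WEIGHTS)                          # non-numeric title falls back to 0.0

# Every ordered comparison against NaN is False, so where NaN lands in a
# ranking depends entirely on how the sort's comparison logic is written.
nan_is_less = buggy < normal
nan_is_greater = buggy > normal
```

Where a NaN score ends up after sorting depends on the comparison logic; in the system described above it ended up on top. A finiteness check on the parsed value, as in `to_number_safe`, is one way to close the gap.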

Comment by VipulNaik on AI #41: Bring in the Other Gemini · 2023-12-09T00:49:16.605Z · LW · GW

He points to recent events at Sports Illustrated. But to me the SI incident was the opposite. It indicated that we cannot do this yet. The AI content is not good. Not yet. Nor are we especially close. Instead people are using AI to produce garbage that fools us into clicking on it. How close are we to the AI content actually being as good as the human content? Good question.

I happen to work at the company (The Arena Group) that operates Sports Illustrated. I wasn't involved with any of this stuff directly (and may not even have been at the company over some of the relevant time period), and obviously I speak only for myself (and will restrict myself to the public information available on this which is anyway most of what I know). With that said: most of the media coverage of this issue was pretty bad (as it is for most issues, but then, Gell-Mann Amnesia ...). In particular, the Futurism article has a lot of issues. The best article I found on this was https://www.theverge.com/2023/11/27/23978389/sports-illustrated-ai-fake-authors-advon-commerce-gannett-usa-today that properly explains who the vendor in question was and the vendor's track record on another site. Overall, the Verge article helps check out the Arena Group's official statement that the articles were from AdVon, despite the seeming reluctance of Futurism (and other publications) to accept that.

The maybe-AI-maybe-not-AI content was written way way before the whole ChatGPT thing raised the profile of AI. Futurism links to https://web.archive.org/web/20221004090814/https://www.si.com/review/full-size-volleyball/ which is a snapshot of the article from October 2022, before the wide release of ChatGPT, and well before The Arena Group made public announcements about experimenting with AI. But in fact the article is even way older than October 2022; the oldest snapshot https://web.archive.org/web/20210916150227/https://www.si.com/review/full-size-volleyball/ of the article is from September 16, 2021, and the article publication date is September 2, 2021. Way before ChatGPT or any of the AI hype. And even that earliest version shows the same photo and a similar bio.

So to my mind this comes down to the use of a vendor (AdVon) with shady practices, and either the lack of due diligence in vendor selection or not caring about the details of their practices as long as they're driving revenue to the site and not hurting the site's brand. The reason to use a vendor like this is simply that they drive affiliate marketing revenue (people find the recommended content interesting, they click on it, maybe buy it, everybody gets a cut of the revenue, everybody is happy). This simply isn't even part of the editorial content and basically has nothing to do with replacing real writers of sports content with AI writers -- it's simply an effort to leverage the brand of the site by running a side business from one corner of it. Also, to the extent it is or isn't ethical, the issue probably has more to do with whether the reviews were genuine rather than whether the authors were human or AI -- even if the authors were human, if the reviews were fraudulent, it would be a problem in equal measure. So overall I think the vendor selection was problematic, but this has little to do with AI.

Separately, many sports sites, including Sports Illustrated, have used automatically (/ "AI")-generated content for routine content such as game summaries, e.g., using the services of Data Skrive: https://www.si.com/author/data-skrive -- this is probably a little closer to the idea of replacing human writers, but the kind of content being created is pretty much the kind of content that humans wouldn't want to spend time creating.

The Arena Group has done some AI experimentation with the goal of trying to use AI-like tools to write normal content (not things like game summaries), as Futurism critiqued at https://futurism.com/neoscope/magazine-mens-journal-errors-ai-health-article but this AdVon thing is completely separate in time, in space, and in purpose.

Comment by VipulNaik on ChatGPT 4 solved all the gotcha problems I posed that tripped ChatGPT 3.5 · 2023-11-30T17:16:42.218Z · LW · GW

Thanks for reviewing and catching these subtle issues!

Technically, it is not true that the prime numbers being multiplied need to be distinct. For example, 2*2=4 is the product of two prime numbers, but it is not the product of two distinct prime numbers.

Good point, I've marked this as an error. My prompt about gcd did specify distinctness but the prompt about product did not, so this is indeed an error.

This seems wrong: "neither can be definitively identified" makes it sound like they exist but just can't be identified...

I passed on this one as being too minor to mark.

Safe primes are a subset of Sophie Germain primes

Not true, e.g. 7 is safe but not Sophie Germain.

Good point; I missed reading this sentence originally. I've marked this one as well.
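Both corrections are easy to check mechanically. A minimal sketch, assuming the standard definitions (a prime p is a Sophie Germain prime if 2p + 1 is also prime, and a safe prime if (p - 1)/2 is also prime); the helper names are mine, chosen for illustration:

```python
def is_prime(n: int) -> bool:
    """Trial division; fine for small n."""
    if n < 2:
        return False
    i = 2
    while i * i <= n:
        if n % i == 0:
            return False
        i += 1
    return True


def is_sophie_germain(p: int) -> bool:
    """p is prime and 2p + 1 is prime."""
    return is_prime(p) and is_prime(2 * p + 1)


def is_safe(p: int) -> bool:
    """p is an odd prime and (p - 1) / 2 is prime."""
    return is_prime(p) and p % 2 == 1 and is_prime((p - 1) // 2)


# Safe primes below 50 that are not Sophie Germain primes.
safe_not_sg = [p for p in range(50) if is_safe(p) and not is_sophie_germain(p)]

# 4 = 2 * 2 is a product of two primes, but not of two distinct primes.
prime_pairs_for_4 = [(a, 4 // a) for a in range(2, 4) if 4 % a == 0 and is_prime(a) and is_prime(4 // a)]
```

Here 7 is safe because (7 - 1)/2 = 3 is prime, but it is not Sophie Germain because 2 * 7 + 1 = 15 is composite.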

Comment by VipulNaik on ChatGPT 4 solved all the gotcha problems I posed that tripped ChatGPT 3.5 · 2023-11-30T16:10:51.640Z · LW · GW

Thanks, fixed now! Sorry I missed that.

Comment by VipulNaik on ChatGPT 4 solved all the gotcha problems I posed that tripped ChatGPT 3.5 · 2023-11-30T16:05:37.044Z · LW · GW

Good question!

First, my original post didn't provide the correct answers for most questions, only what was wrong with ChatGPT. Going from knowing what was wrong to actually giving correct answers seems like quite a leap. Further, ChatGPT changed its answers to better ones (including more rigorous explanations) even in cases where its original answers were correct.

Second, ChatGPT's self-reported training data cutoff date at the time it was asked these questions was September 2021 or January 2022. To my knowledge, Issa didn't ask it this question, but sources like https://www.reddit.com/r/ChatGPT/comments/16m6yc7/gpt4_training_cutoff_date_is_now_january_2022/ suggest that it was September 2021 at the time of his sessions, then became January 2022. So, the blog post, published in December 2022, should not have been part of its training data.

With that said, the sessions themselves (not the blog post about them) might have been part of the feedback loop, but in most cases I did not tell ChatGPT what was wrong with its answers within the session itself.

Comment by VipulNaik on ChatGPT 4 solved all the gotcha problems I posed that tripped ChatGPT 3.5 · 2023-11-30T06:27:02.745Z · LW · GW

Thanks, and sorry I missed that error. I've updated the post by bolding the error, and also HT'ed your contribution.

Comment by VipulNaik on riceissa's Shortform · 2023-10-02T19:39:27.946Z · LW · GW

Regarding the topic of batteries getting better while also being harder to remove/replace, I found these interesting comments by FeRD (https://www.ifixit.com/Wiki/What_to_do_with_a_swollen_battery?permalink=comment-911590#comment-911590 + the next two comments) that I quote in full below:

(1/2) I doubt it's entirely fair to blame Apple for this; they may not have even been the first manufacturer to use a non-removable battery pack. They were certainly the first biggest company to make that switch, but plenty of others did the same, and in some cases way too quickly for it to have been a case of them "copying" Apple. They simply came to the same conclusion: Removable batteries, like physical keyboards, hurt sales more than they helped. So, out they went.

Because, the frustration of it is that device manufacturers genuinely have really "good" reasons why batteries are no longer removable. Actually, at least two good reasons: Water-resistance and miniaturization.

Today's phones are surprisingly watertight, to the point where many can survive a brief dunk with no immediate ill effects. (Though I personally suspect that water infiltration is a trigger for subsequent battery swelling.) My Galaxy Note8 once survived a complete submersion lasting 2-3 seconds. Don't try that with your Nokia 3310!

FeRD - Sep 21, 2023

(2/2) All battery-powered devices are also continually getting smaller and smaller. (Or, failing that, they're packing more and more stuff into roughly the same amount of space.) And the fact is, removable batteries require MUCH more space than non-removable.

If a battery is removable, it has to have an outer, protective case of its own due to the dangerous chemicals inside. The phone would then also have to have mechanisms to align and secure the battery, a latch and release mechanism, and electrical contacts between what are now (effectively) two completely separate devices. That all takes up space.

A removable battery makes a device significantly larger (in particular, thicker), or else it has half the capacity of the non-removable design. Either way, 999 out of 1000 consumers will choose the smaller, thinner, lighter fixed-battery device with twice the runtime between charges, over a bigger, thicker, doesn't-last-as-long alternative with a removable battery.

FeRD - Sep 21, 2023

Actually, now that I think about it there's a third reason that's even more damning:

Removable batteries were never about extending device lifetime.

Manufacturers will tell you, and they can provide reams of consumer data to back it up: The percentage of consumers who keep a device long enough to wear out the first battery is TINY. Laughably tiny. The overwhelming majority of mobile-device owners want to replace their device with a newer, faster one every 2 years or less — long before the battery is even starting to degrade. (After all, until very recently the technology was advancing so quickly, a 2-year-old phone was nigh-unusable, given its limitations compared to newer models.)

Removable batteries were always intended for power-users who needed more runtime than they could get from a single battery. They'd own two+, and swap them out as needed (charging externally). In the end, rapid charging, larger capacities, and improved power-management software provided a better solution to that problem.

FeRD - Sep 21, 2023

Comment by VipulNaik on riceissa's Shortform · 2023-08-13T18:59:05.500Z · LW · GW

A few thoughts:

In at least one area, namely cars, durability relative to actual usage has improved a lot over the past 50 years or so. See for instance https://en.wikipedia.org/wiki/Car_longevity "According to the New York Times, in the 1960s and 1970s, the typical car reached its end of life around 100,000 miles (160,000 km), but due to manufacturing improvements in the 2000s, such as tighter tolerances and better anti-corrosion coatings, the typical car lasts closer to 200,000 miles (320,000 km)."
The area where I'm most aware of claims of reduced durability is home appliances, e.g. https://ryanfinlay.medium.com/they-used-to-last-50-years-c3383ff28a8e but I think a couple of factors make the comparison tricky: (a) low cost: there's a much wider selection of home appliances, and the low-end ones, which are quite cheap, still last several years. That is obviously less than the high end and the great older devices, but it is probably good enough and a good deal relative to cost. Low-end refrigerators, for instance, cost only a bit more than phones, which is remarkable considering the size difference! (b) energy use as a major component of cost over the long term: for appliances like refrigerators, electricity becomes a major cost component if the appliance lasts a long time, so a long-lasting refrigerator that doesn't benefit from energy efficiency improvements may ultimately cost more.
In the realm of electronics, quality improvements in hardware even over the last 5-10 years have been very impressive; for instance, batteries have gotten better, and design/form factors have gotten better. Repairability in particular has gotten worse, but this seems tied to the trend toward miniaturization and portability. If people plan to replace their devices every few years anyway, then portability probably wins over repairability.

Comment by VipulNaik on Why have exposure notification apps been (mostly) discontinued? · 2023-07-10T21:34:01.906Z · LW · GW

Hmm, aren't exposure notifications an opt-in program? I was never forced to get them -- I chose to download and install the app and keep it on. The same way I choose to allow Google Maps to keep a record of my physical location.

Comment by VipulNaik on To err is neural: select logs with ChatGPT · 2023-01-27T18:13:55.204Z · LW · GW

I added logs of two further ChatGPT sessions, one of which repeated many of the prompts I used here, tested against the 2023-01-09 version of ChatGPT: https://github.com/vipulnaik/working-drafts/commit/427f5997d48d78c69e3e16eeca99f0b22dc3ffd3

I had originally been thinking of formatting these into a blog post or posts, and I might still do so, but probably not for the next two months, so just sharing the raw logs for now so that people reading the comments on this post see my update.

Comment by VipulNaik on Preventing, reversing, and addressing data leakage: some thoughts · 2022-11-15T06:50:41.966Z · LW · GW

Good point -- I removed the tag!

Comment by VipulNaik on riceissa's Shortform · 2022-09-10T20:09:09.628Z · LW · GW

Somewhat related, though different in various ways, is this post by Bryan Caplan: https://www.econlib.org/the-cause-of-what-i-feel-is-what-i-do-how-i-eliminate-pain/

Comment by VipulNaik on Entitlement as a major amplifier of unhappiness · 2022-06-09T04:05:01.008Z · LW · GW

I'm curious to hear examples of other worthwhile things in the direction that you have in mind!

Comment by VipulNaik on The case for "mental strength" · 2022-02-09T00:30:28.776Z · LW · GW

Thanks -- good points and well-presented with precision and flair!

Comment by VipulNaik on The case for "mental strength" · 2022-02-09T00:29:31.230Z · LW · GW

Good point! It could be that both kinds of mental exercise (excess stimulation and lack of stimulation) are important for building mental strength; modern society provides the former in abundance (and particularly so for LessWrong readers!), so the form of exercise we're constrained on is the lack-of-stimulation kind (and that's where meditation helps). How far-fetched does that sound?

Comment by VipulNaik on Risk and Safety in the age of COVID · 2022-02-01T22:46:05.816Z · LW · GW

An anonymous friend to whom I sent this post writes:

He has a good point that most people just want to do the universal “safety” precautions. I think a big reason that he doesn’t mention is that reasonable precautions are how all businesses defend themselves from lawsuits (e.g. sexual harassment and DEI training); as long as they take the reasonable precautions, then they are immune from lawsuits. But I don’t buy “safety” as an explanation for what policies are possible. It sounds like a just-so story for why we are in the mess that we are in. Vaccines are available and thus a reasonable precaution because the government subsidized them. Respirators and HVAC filters could have been a reasonable precaution if governments had subsidized and encouraged them. I think better leadership could have made a difference.

I wish the federal government made it patriotic to make PPE and safety retrofits. They should have announced programs to set up melt blown respirator manufacturing in America. They should have made a tax credit (like the solar tax credit) for retrofits that improve ventilation in retail or offices.

Comment by VipulNaik on Quick Poll: Booster Reactions · 2021-12-25T02:02:59.774Z · LW · GW

I just got my booster dose today (December 24) and intend to monitor closely. I'll be regularly updating https://github.com/vipulnaik/diet-exercise-health/blob/master/notes/2021-12-24-pfizer-covid-vaccine-booster-dose.md with temperature readings and subjective details of my experience.

I did similar logging after the second dose, which you can see here: https://github.com/vipulnaik/diet-exercise-health/blob/master/notes/2021-06-25-pfizer-covid-vaccine-dose-2.md

Comment by VipulNaik on Chris Voss negotiation MasterClass: review · 2021-11-30T02:45:17.536Z · LW · GW

By "this way" do you mean the way I wrote it or the way Alexei would have preferred?

Comment by VipulNaik on Chris Voss negotiation MasterClass: review · 2021-11-25T16:40:43.532Z · LW · GW

Good point; I added a section clarifying this focus: https://www.lesswrong.com/posts/CRAzG386t3suSqDgd/chris-voss-negotiation-masterclass-review#Low_level_execution_focus_rather_than_domain_specific_tactical_or_business_school_style_strategy_focus

Comment by VipulNaik on Chris Voss negotiation MasterClass: review · 2021-11-25T04:58:08.909Z · LW · GW

Thanks for the feedback! It seems like you're saying I should first have done "negotiation techniques" then "do these negotiation techniques have a place in rational discourse?" as separate sections. So if we make a table with rows as techniques and columns as lenses, then I should have traversed it column major instead of row major.
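To make the row-major versus column-major distinction concrete, here is a minimal sketch. The technique and lens labels are placeholders for illustration, not the review's actual section list.

```python
# Rows are techniques, columns are lenses (placeholder labels).
techniques = ["mirroring", "labeling", "calibrated questions"]
lenses = ["what the technique is", "place in rational discourse"]


def row_major(rows, cols):
    """Finish every lens for one technique before moving to the next technique."""
    return [(r, c) for r in rows for c in cols]


def column_major(rows, cols):
    """Finish every technique under one lens before moving to the next lens."""
    return [(r, c) for c in cols for r in rows]


row_order = row_major(techniques, lenses)
column_order = column_major(techniques, lenses)
```

Both orders visit the same six cells; only the grouping differs. Row-major keeps each technique's discussion together, while column-major yields a "techniques" section followed by a "rational discourse" section.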

Did I misunderstand or miss an angle to what you're saying?

Comment by VipulNaik on Chris Voss negotiation MasterClass: review · 2021-11-25T04:50:51.484Z · LW · GW

Good point! Voss talks a bit about how many of these techniques feel odd. Two points he makes:

Practice in low-stakes situations to get more comfortable with it. Don't try any negotiation technique in a high-stakes situation that you don't have practice with!
In many cases the discomfort you experience saying it isn't noticed by others. Voss gives examples related to mirroring as well as to the calibrated question "How am I supposed to do that?" People feel apprehensive asking the question but it usually works despite their apprehension.

I would also add that it's more important to stick to things you believe in than to try to literally apply something that you feel is bad or wrong. If you're convinced that, in a given situation, a label of "it sounds like you're very happy with the way this turned out" is a gaming of the other person, don't use it. But if in a situation you think it's actually an accurate label that helps summarize the situation and correctly shows the other person that you are tuned in to what they are feeling and expressing, do it! Just keep an open mind to the possibility of using labels.

Summary (added): Basically I think if you use low-stakes practice and only selectively apply to the real world the skills you are comfortable with, you don't need to experience an intermittent dip in effectiveness due to not feeling authentic.

Comment by VipulNaik on Timeline of AI safety · 2021-02-09T00:48:43.158Z · LW · GW

We cover a larger period in the overall summary and full timeline. The summary by year starts in 2013 because (it appears that) that's around the time that enough started happening per year, though we might extend it a little further into the past as we continue to expand the timeline.

Comment by VipulNaik on [deleted post] 2021-01-07T14:07:12.861Z

<describe lockdowns as social engineering>

Did you intend to expand this?

<Michael Mina stuff here>

Did you intend to expand this?

Comment by VipulNaik on [Announcement] LessWrong will be down for ~1 hour on the evening of April 10th around 10PM PDT (5:00AM GMT) · 2020-04-09T05:38:46.827Z · LW · GW

Do you mean PDT instead of PST in the title?

Comment by VipulNaik on Coronavirus: California case growth · 2020-04-02T02:01:03.130Z · LW · GW

I did some rewording of the post that made it a little more wordy, but fingers crossed that that part has now become less confusing.

Comment by VipulNaik on Coronavirus: California case growth · 2020-04-02T01:59:31.824Z · LW · GW

Thank you for the feedback (and also for discussing this at length, which gave me a better understanding of the nuances). I changed it to a clumsier but hopefully more what-you-see-is-what-I-mean term: https://www.lesswrong.com/posts/mRkWTpH9mb8Wdpcn5/coronavirus-california-case-growth?commentId=GHSEwZwR2TSkyzpdm

Comment by VipulNaik on Coronavirus: California case growth · 2020-04-02T01:57:28.549Z · LW · GW

Thank you for the feedback. I agree with Lukas Gloor's reply below that the choice of term is confusing as it differs from what people may intuitively think "true cases" means. I also agree with his remark that setting terminology that is consistent with reality isn't bad in and of itself.

I have therefore changed "true cases" to "true currently-or-eventually-symptomatic cases". I think that provides the level of precision needed for our purposes. I haven't found a better term after some searching (though not a lot); however, I'm happy to change to a more concise and medically accepted term if I get to learn of one.

Comment by VipulNaik on Coronavirus: California case growth · 2020-03-30T02:17:17.013Z · LW · GW

What I wrote there was assuming that the number of new true cases drops to a fairly low level. Whether that happens now or a week or two or three later is unclear; if the 2 -> 3 backlog is growing, then resolving that backlog will add more delay.

I posited us already being at this point as the "optimistic" scenario.

I'll reword the post to clarify this.

Comment by VipulNaik on Introducing Foretold.io: A New Open-Source Prediction Registry · 2019-10-19T14:41:28.912Z · LW · GW

Directly visiting http://foretold.io gives an ERR_NAME_NOT_RESOLVED. Can you make it so that foretold.io redirects to www.foretold.io?

Comment by VipulNaik on The why and how of daily updates · 2019-05-06T13:48:07.602Z · LW · GW

That's a normal part of life :). Any things that I decide to do in a future day, I'll copy/paste to over there, but I usually won't delete the items from the checklist for the day where I didn't complete them (thereby creating a record of things I expected or hoped to do, but didn't).

For instance, at https://github.com/vipulnaik/daily-updates/issues/54 I have two undone items.

Comment by VipulNaik on Raemon's Shortform · 2018-07-29T20:13:53.440Z · LW · GW

There is some related stuff by Carl Shulman here: https://www.greaterwrong.com/posts/QSHwKqyY4GAXKi9tX/a-personal-history-of-involvement-with-effective-altruism#comment-h9YpvcjaLxpr4hd22 that largely agrees with what I said.

Comment by VipulNaik on Raemon's Shortform · 2018-07-16T05:11:18.628Z · LW · GW

My understanding is that Against Malaria Foundation is a relatively small player in the space of ending malaria, and it's not clear the funders who wish to make a significant dent in malaria would choose to donate to AMF.

One of the reasons GiveWell chose AMF is that there's a clear marginal value of small donation amounts in AMF's operational model -- with a few extra million dollars they can finance bednet distribution in another region. It's not necessarily that AMF itself is the most effective charity to fund in order to end malaria -- it's just the one with the best proven cost-effectiveness for donors at the scale of a few million dollars. But it isn't necessarily the best opportunity for somebody with much larger amounts of money who wants to end malaria.

For comparison:

In its ~15-year existence, the Global Fund says it has disbursed over $10 billion for malaria and states that 795 million insecticide-treated nets were funded (though it's not clear if these were actually funded all through the 10 billion disbursed by the Global Fund). It looks like their annual malaria spend is a little under a billion. See https://www.theglobalfund.org/en/portfolio/ for more. The Global Fund gets a lot of its funding from governments; see https://timelines.issarice.com/wiki/Timeline_of_The_Global_Fund_to_Fight_AIDS%2C_Tuberculosis_and_Malaria for more on their history and programs.
The Gates Foundation spends ~$3.5 billion annually, of which $150-450 million every year is on malaria. See https://donations.vipulnaik.com/donor.php?donor=Bill+and+Melinda+Gates+Foundation#donorDonationAmountsBySubcauseAreaAndYear and https://donations.vipulnaik.com/donor.php?donor=Bill+and+Melinda+Gates+Foundation&cause_area_filter=Global+health%2Fmalaria#donorDonationAmountsByDoneeAndYear They're again donating to organizations with larger backbones and longer histories (like PATH, which has over 10,000 people, and has been around since 1978), that can absorb large amounts of funding, and the Gates Foundation seems more cash-constrained than opportunity-constrained.

The main difference I can make out between the EA/GiveWell-sphere and the general global health community is that malaria interventions (specifically ITNs) get much more importance in the EA/GiveWell-sphere, whereas in the general global health spending space, AIDS gets more importance. I've written about this before: http://effective-altruism.com/ea/1f9/the_aidsmalaria_puzzle_bleg/

Comment by VipulNaik on Should we be spending no less on alternate foods than AI now? · 2017-10-31T05:35:43.651Z · LW · GW

I tried looking in the IRS Form 990 dataset on Amazon S3, specifically searching the text files for forms published in 2017 and 2016.

I found no match for (case-insensitive) openai, other than one organization that was clearly different (its name had openair in it). Searching (case-insensitive) "open ai" gave matches that all had "open air" or "open aid" in them. So, it seems like either they have a really weird legal name or their Form 990 has not yet been released. Googling didn't reveal any articles of incorporation or legal name.
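The false positives come from plain substring matching: "openair" contains "openai", and "open air" and "open aid" both contain "open ai". A minimal sketch of the difference a word-boundary regex makes, using made-up organization names (none of these are taken from the actual Form 990 data):

```python
import re

# Made-up names for illustration only.
names = [
    "Openair Cinema Society",
    "Open Air Museum Fund",
    "Open Aid Alliance",
    "OpenAI Inc",
]

# Naive case-insensitive substring search, as described above.
substring_hits = [
    n for n in names
    if "openai" in n.lower() or "open ai" in n.lower()
]

# Word-boundary search: "open", an optional space, "ai", then end of word.
pattern = re.compile(r"\bopen ?ai\b", re.IGNORECASE)
boundary_hits = [n for n in names if pattern.search(n)]
```

With these sample names the substring search flags all four, while the word-boundary pattern keeps only the last one. This only helps if the legal name actually contains "OpenAI" or "Open AI" as a word, which is exactly what was unclear.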

Comment by VipulNaik on Writing That Provokes Comments · 2017-10-04T16:45:17.312Z · LW · GW

In my experience, writing full-fledged, thoroughly researched material is pretty time-consuming, and if you push that out to the audience immediately, (1) you've sunk a lot of time and effort that the audience may not appreciate or care about, and (2) you might have too large an inferential gap with the audience for them to meaningfully engage.

The alternative I've been toying with is something like this: when I'm roughly halfway through an investigation, I publish a short post that describes my tentative conclusions, without fully rigorous backing, but with (a) clearly stated conclusions, and (b) enough citations and other signals that there's decent research backing my process. Then I ask people what they think of the thesis, which parts they are interested in, and what they are skeptical of. Then after I finish the rest of the investigation I push a polished writeup only for those parts (for the rest, it's just informal notes + general pointers).

For examples, see https://www.lesserwrong.com/posts/ghBZDavgywxXeqWSe/wikipedia-pageviews-still-in-decline and http://effective-altruism.com/ea/1f9/the_aidsmalaria_puzzle_bleg/ (both are just the first respective steps for their projects).

I feel like this both makes comments more valuable to me and gives more incentive to commenters to share their thoughts, but the jury is still out.

Comment by VipulNaik on Wikipedia pageviews: still in decline · 2017-09-30T14:50:26.209Z · LW · GW

FWIW, my impression is that data on Wikipedia has gotten somewhat more accurate over time, due to the push for more citations, though I think much of this effect occurred before the decline started. I think the push for accuracy has traded off a lot against growth of content (both growth in number of pages and growth in amount of data on each page). These are crude impressions (I've read some relevant research but don't have strong reason to believe that should be decisive in this evaluation) but I'm curious to hear what specific impressions you have that are contrary to this.

Comment by VipulNaik on Wikipedia pageviews: still in decline · 2017-09-30T14:46:35.195Z · LW · GW

If you have more fine-grained data at your disposal on different topics and how much each has grown or shrunk in terms of number of pages, data available on each page, and accuracy, please share :).

Comment by VipulNaik on Wikipedia pageviews: still in decline · 2017-09-30T04:05:04.345Z · LW · GW

In the case of Wikipedia, I think the aspects of quality that correlate most with explaining pageviews are readily proxied by quantity. Specifically, the main quality factors in people reading a Wikipedia page are (a) the existence of the page (!), (b) whether the page has the stuff they were looking for. I proxied the first by number of pages, and the second by length of the pages that already existed. Admittedly, there are a lot more subtleties to quality measurement (which I can go into in depth at some other point) some of which can have indirect, long-term effects on pageviews, but on most of these dimensions Wikipedia hasn't declined in the last few years (though I think it has grown more slowly than it would with a less dysfunctional mod culture, and arguably too slowly to keep pace with the competition).

Comment by VipulNaik on Wikipedia pageviews: still in decline · 2017-09-29T04:51:46.927Z · LW · GW

Great point. As somebody who has been in the crosshairs of Wikipedia mods (see ANI) my bias would push me to agree :). However, despite what I see as problems with Wikipedia mod culture, it remains true that Wikipedia has grown quite a bit, both in number of articles and length of already existing articles, over the time period when pageviews declined. I suspect the culture is probably a factor in that it represents an opportunity cost: a better culture might have led to an (even) better Wikipedia that would not have declined in pageviews so much, but I don't think the mod culture led to a quality decline per se. In other words, I don't think the mechanism:

counterproductive mod culture -> quality decline -> pageview decline

is feasible.

Comment by VipulNaik on Wikipedia pageviews: still in decline · 2017-09-28T05:03:43.539Z · LW · GW

Great points. As I noted in the post, search and social media are the two most likely proximal mechanisms of causation for the part of the decline that's real. But neither may represent the "ultimate" cause: the growth of alternate content sources, or better marketing by them, or changes in user habits, might be what's driving the changes in social media and search traffic patterns (in the sense that the reason Google's showing different results, or Facebook is making some content easier to share, is itself driven by some combination of what's out there and what users want).

The main challenge with search engine ranking data is that (a) the APIs forbid downloading the data en masse across many search terms, and (b) getting historical data is difficult. Some SEO companies offer historical data, but based on research Issa and I did last year, we'd have to pay a decent amount to even be able to see if the data they have is helpful to us, and it may very well not be.

The problem with Google Trends is that (a) it does a lot of normalization (it normalizes search volume relative to total search volume at the time), which makes it tricky to interpret data over time, and (b) it's hard to download data en masse. Also, a lot of Google Trends results are just amusingly weird, e.g. https://trends.google.com/trends/explore?date=all&q=Facebook (see https://www.facebook.com/vipulnaik.r/posts/10208985033078964 for more discussion). Are we really to believe that interest in Facebook spiked in October 2012, and that it has returned in 2017 (after a 5-year decline) to what it used to be back in 2009? Google Trends is just another messy data series whose nuances I would have to acquire expertise in, not a reliable beacon of truth against which Wikipedia data can be compared.
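To illustrate the normalization issue, here is a minimal sketch with made-up numbers. It is not Google's actual method, and the function name `relative_index` and both input lists are hypothetical. It only shows how an index based on share of total search volume can fall even while absolute searches for a term keep rising.

```python
# Hypothetical numbers, purely illustrative; not real Google Trends data.
term_searches = [100, 150, 200, 250]         # absolute searches for one term, per period
total_searches = [1000, 2000, 4000, 10000]   # all searches on the engine, per period


def relative_index(term, total):
    """Share of total search volume, rescaled so the peak period equals 100."""
    shares = [t / n for t, n in zip(term, total)]
    peak = max(shares)
    return [round(100 * s / peak) for s in shares]


index = relative_index(term_searches, total_searches)
print(index)  # the index falls each period although term_searches rises
```

In this toy case the term's absolute volume grows 2.5x, but the index drops because total volume grows faster, which is one reason a declining curve does not by itself show declining interest.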

The one external data source I have been able to collect with reasonable reliability is Facebook share counts. At the end of each month, I record Facebook share counts for a number of Wikipedia pages by hitting the Facebook API (a process that takes several days because of Facebook's rate limiting). Based on this I now have decent time series of cumulative Facebook share counts, such as https://wikipediaviews.org/displayviewsformultiplemonths.php?tag=Colors&allmonths=allmonths-api&language=en&drilldown=cumulative-facebook-shares If I do a more detailed analysis, this data will be important for evaluating the social media hypothesis.
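As a rough sketch of how such a cumulative series could be used, the snippet below turns end-of-month cumulative share counts into month-over-month increments. The counts and the function name `monthly_increments` are hypothetical and are not taken from the actual data or from any Facebook API.

```python
# Hypothetical end-of-month cumulative share counts for a single page.
cumulative_shares = {
    "2017-06": 1200,
    "2017-07": 1260,
    "2017-08": 1345,
    "2017-09": 1390,
}


def monthly_increments(cumulative):
    """New shares gained in each month, from consecutive cumulative snapshots."""
    months = sorted(cumulative)  # "YYYY-MM" keys sort chronologically
    return {
        cur: cumulative[cur] - cumulative[prev]
        for prev, cur in zip(months, months[1:])
    }


increments = monthly_increments(cumulative_shares)
print(increments)
```

The first month has no increment because there is no earlier snapshot to subtract from it.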

How interested are you in seeing an exploration of the search engine ranking and increased use of social media hypotheses?

Comment by VipulNaik on Beta - First Impressions · 2017-09-27T18:42:59.543Z · LW · GW

This is already an issue: https://github.com/Discordius/Lesswrong2/issues/168

Comment by VipulNaik on Wikipedia pageviews: still in decline · 2017-09-27T15:10:27.060Z · LW · GW

The Wikimedia Foundation has not ignored the decline. For instance, they discuss the overall trends in detail in their quarterly readership metrics reports, the latest of which is at https://commons.wikimedia.org/wiki/File:Wikimedia_Foundation_Readers_metrics_Q4_2016-17_(Apr-Jun_2017).pdf The main differences between what they cover and what I intend to cover are (a) they only cover overall rather than per-page pageviews, (b) they focus more on year-over-year comparisons than long-run trends, and (c) related to (b), they don't discuss the long-run causes. However, these reports are a great way of catching up on incremental overall traffic level updates as well as any analytics or measurement discrepancies that might be driving weird numbers.

The challenge of raising more funds with declining traffic has also been noted in fundraiser discussions, such as at https://en.wikipedia.org/wiki/Wikipedia:Wikipedia_Signpost/2015-10-14/News_and_notes which has the quote:

Better performing banners are required to raise a higher budget with declining traffic. We’ll continue testing new banners into the next quarter and sharing highlights as we go.

Comment by VipulNaik on Wikipedia pageviews: still in decline · 2017-09-27T01:43:49.377Z · LW · GW

They still show up in the total comment count :).

Comment by VipulNaik on Wikipedia pageviews: still in decline · 2017-09-26T23:04:47.956Z · LW · GW

Comment by VipulNaik on LessWrong analytics (February 2009 to January 2017) · 2017-04-18T21:25:40.463Z · LW · GW

For all the talk about the "decline" of LessWrong, total pageviews and sessions to LessWrong have stayed 5-10 times higher than those to the Effective Altruism Forum (the EAF numbers are documented in my post).

Comment by VipulNaik on Wikipedia usage survey results · 2017-03-17T22:00:37.860Z · LW · GW

The 2017 SSC Survey had 5500 respondents. Presumably this survey was more widely visible and available than mine (which was one link in the middle of a long link list).

https://slatestarcodex.com/2017/03/17/ssc-survey-2017-results/

Comment by VipulNaik on Wikipedia usage survey results · 2017-01-07T23:58:42.498Z · LW · GW

Varies heavily by context. Typical alternatives:

(a) Google's own answers for simple questions.

(b) Transactional websites for search terms that denote possible purchase intent, or other websites that are action-oriented (e.g., Yelp reviews).

(c) More "user-friendly" explanation sites (e.g., for medical terminology, a website that explains it in a more friendly style, or WikiHow)

(d) Subject-specific references (some overlap with (c), but could also include domain Wikias, or other wikis)

(e) When the search term is trending because of a recent news item, then links to the news item (even if the search query itself does not specify the associated news)

Comment by VipulNaik on Wikipedia usage survey results · 2016-12-28T23:19:40.185Z · LW · GW

Interesting. I suspect that even among verbal elites, there are further splits in the type of consumption. Some people are heavy on reading books since they want a full, cohesive story of what's happening, whereas others consume information in smaller bits, building pieces of knowledge across different domains. The latter would probably use Wikipedia more.

Similarly, some people like opinion-rich material whereas others want factual summaries more. The factual summary camp probably uses Wikipedia more.

However, I don't know if there are easy ways of segmenting users, i.e., I don't know if there are websites or communities that are much more dominated by users who prefer longer content, or users who prefer factual summaries.

Comment by VipulNaik on Wikipedia usage survey results · 2016-12-26T05:22:30.690Z · LW · GW

Good idea, but I don't think he does the census that frequently. The most recent one I can find is from 2014: http://slatestarcodex.com/2015/11/04/2014-ssc-survey-results/

The annual LessWrong survey might be another place to consider putting it. I don't know who's responsible for doing it in 2017, but when I find out I'll ask them.

Comment by VipulNaik on Wikipedia usage survey results · 2016-12-25T15:00:31.486Z · LW · GW

It's not too late, if I do so decide :). In other words, it's always possible to spend later for larger samples, if that actually turns out to be something I want to do.

Right now, I think that:

It'll be pretty expensive: I'd probably want to use several different survey tools, since each has its strengths and weaknesses (so SurveyMonkey, Google Surveys, maybe Survata and Mechanical Turk as well). Then with each I'd need 1000+ responses to be able to regress against all variables and variable pairs. The costs add up quickly to over a thousand dollars.

I don't currently have that much uncertainty: It might show that age and income actually do explain a little more of the variation than it seems right now (and that would be consistent with the Pew research). But I feel that we already have enough data to see that they don't have anywhere near the effect that SSC membership has.

I'm open to arguments to convince me otherwise.
