LW Open Source – Overview of the Codebase
post by Raemon · 2018-05-16T23:50:35.108Z · LW · GW · 3 comments
If you’re interested in making longterm, serious contributions to the LessWrong codebase, it’ll be helpful to know some of the gritty details of how the codebase fits together.
This is an overview of the file structure – some of the root files that bind the application together, and the most important folders of the lesswrong package in particular.
The Ties That Bind
Understanding the “glue” files that connect the app together.
After you’ve gotten a local server up and running, take a look at these files in the root directory:
- package.json – This is the npm package manifest. In addition to telling npm install which node packages to download, it defines a few scripts that you’ll want for running the application:
  - npm start – runs the application using settings.json. It will first trigger the npm prestart script, which will run the prestart_lesswrong.sh shell script.
  - npm test – runs the automated tests. We haven’t actually come up with a testing framework that’s fast/good/reliable enough to be worth the overhead, so for now ignore this.
- sample_settings.json – These are the default settings, which don’t come with any passwords or connections to our mongo database or galaxy server.
- settings.json – Automatically generated from sample_settings.json.
If you’re doing longterm development and are trusted by the site admins, we can send you credentials for either the development or production server/database.
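To see how the scripts in package.json chain together, here is a minimal Node sketch. The helper names readNpmScripts and runOrder are made up for illustration and are not part of the codebase. The sketch relies only on npm's standard behaviour of running a "pre<name>" script before "<name>" (and a "post<name>" script after it), which is why npm start triggers the prestart script first.

```javascript
// Sketch: inspect the npm scripts declared in a project's package.json.
// readNpmScripts and runOrder are hypothetical helpers, not LessWrong code.
const fs = require('fs');
const path = require('path');

// Returns an object mapping script name -> shell command (empty if none are defined).
function readNpmScripts(projectDir) {
  const manifestPath = path.join(projectDir, 'package.json');
  const manifest = JSON.parse(fs.readFileSync(manifestPath, 'utf8'));
  return manifest.scripts || {};
}

// npm runs "pre<name>" before "<name>" and "post<name>" after it, when they exist.
function runOrder(scripts, name) {
  const order = [];
  if (scripts['pre' + name]) order.push('pre' + name);
  if (scripts[name]) order.push(name);
  if (scripts['post' + name]) order.push('post' + name);
  return order;
}
```

Calling runOrder(readNpmScripts('.'), 'start') from the repository root should list prestart before start, assuming both scripts are defined there as described above.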
You’ll also see these folders:
- node_modules/ – where all node modules are installed
- packages/ – where all vulcanJS packages live.
~/packages/lesswrong
This is where most of our code lives.
Each vulcanJS package has a package.js file, which initializes the package. Glancing inside packages/lesswrong/package.js, you’ll see:
- Package.onUse – determines what dependencies to load when the app is run normally. api.use pulls in other VulcanJS packages, which in turn pull in their own dependencies. It also establishes server.js and client.js as the main modules.
- Package.onTest – same, but for the testing environment
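The package.js structure described above can be sketched roughly as follows. Package.onUse, Package.onTest, api.use and api.mainModule are Meteor's package API, but everything else here is an assumption: the package names are hypothetical, and the Package and api objects at the top are small stand-ins so the sketch runs under plain Node (in a real package.js, the Meteor build tool provides them).

```javascript
// Stand-ins for the globals Meteor provides at build time (assumption: only
// the calls used below are modelled; the real objects do much more).
const recorded = { use: [], mainModules: {} };
const api = {
  use(packages) { recorded.use.push(...packages); },
  mainModule(file, architecture) { recorded.mainModules[architecture] = file; },
};
const Package = {
  describe(info) { recorded.describe = info; },
  onUse(callback) { callback(api); },
  onTest(callback) { recorded.onTest = callback; }, // stored, not run, in this sketch
};

// --- What a package.js roughly looks like (names are hypothetical) ---
Package.describe({ name: 'lesswrong' });

Package.onUse(function (api) {
  // Dependencies to load when the app runs normally.
  api.use(['example:some-vulcan-package']);
  // Entry points for each environment.
  api.mainModule('server.js', 'server');
  api.mainModule('client.js', 'client');
});

Package.onTest(function (api) {
  // Same idea, but for the testing environment.
  api.use(['example:some-vulcan-package']);
});
```

After this runs, recorded.mainModules holds one entry file per environment, which is the sense in which package.js establishes server.js and client.js as the main modules.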
And then you’ll find:
- server.js – this is the main module that our app runs on the server. Essentially a giant list of import statements that plugs the various pieces of the package together.
- client.js – same, but for the client (this is what gets loaded whenever a user loads a page on lesswrong.com). This basically just imports the index.js file from the lib folder.
One last layer of “glue” files exists inside the lib folder: the aforementioned index.js, which in turn imports components.js (as well as all other .js files).
If you've written some new files and your code isn't working, there's a good chance you forgot to import those files into components.js or index.js (depending on whether they're a React component or a general javascript file).
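Why a missing import breaks things can be illustrated with a toy registry. This is a simplified stand-in and not the real Vulcan code: it assumes components become available only as a side effect of their file being loaded, in the style of Vulcan's registerComponent. The component and file names are hypothetical.

```javascript
// Toy component registry (hypothetical; not the real Vulcan implementation).
const registry = {};

function registerComponent(name, component) {
  registry[name] = component;
}

function getComponent(name) {
  if (!(name in registry)) {
    throw new Error(`Component "${name}" is not registered. Was its file imported?`);
  }
  return registry[name];
}

// Each "file" registers its component only when the file is actually loaded.
const files = {
  'PostsItem.jsx': () => registerComponent('PostsItem', () => 'post item'),
  'CommentsItem.jsx': () => registerComponent('CommentsItem', () => 'comment item'),
};

// Stand-in for components.js: only the files listed here get loaded.
const importedByComponentsJs = ['PostsItem.jsx']; // CommentsItem.jsx was forgotten
importedByComponentsJs.forEach((file) => files[file]());
```

Here getComponent('PostsItem') works, while getComponent('CommentsItem') throws, because the code that registers it never ran. Adding the forgotten file to the import list fixes it.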
The lesswrong package structure
In the lesswrong package folder, you'll find the following folders:
- assets – this is basically irrelevant to you
- components – where all React components live
- lib – where most javascript other than components lives
- styles – mostly deprecated; we used to style our components with sass (an extension of css that includes variables and mixins). We now style most components with jss, a javascript/css hybrid, with the styles declared directly in the component's .jsx file.
- testing – someday we dream that this folder will be useful to you, but as previously noted, we haven't yet found a testing framework that meets all our needs.
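For a sense of what jss-style declarations look like, here is a minimal sketch. The theme shape, class names and values are hypothetical, and the sketch only builds the plain style object. In a real component, a JSS-aware library (which one the codebase uses isn't covered here) would turn this object into generated class names.

```javascript
// Styles declared next to the component, as a function of a theme object.
// Assumption: the theme fields (spacing, textColor) are made up for illustration.
const styles = (theme) => ({
  root: {
    padding: theme.spacing * 2,
    color: theme.textColor,
    // JSS-style nested selector: applies when the root element is hovered.
    '&:hover': { opacity: 0.8 },
  },
  title: {
    fontWeight: 600,
  },
});

// Resolving the styles against an example theme yields plain objects.
const exampleTheme = { spacing: 8, textColor: '#333' };
const resolvedStyles = styles(exampleTheme);
```

Because the styles are ordinary javascript, they can use variables and functions directly, which covers much of what sass variables and mixins were used for.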
The lib folder
The lib folder contains some legacy files that aren't really in use (or are unlikely to be relevant). Folders that I do expect to be important to you are:
- collections – a folder for each mongo collection, and the associated schema, callbacks, permission and helper functions that go along with it. (I'll be covering this in more detail in a future post)
- editor – helper functions for the draftJS plugin (such as converting LaTeX and html)
- i18n-en-us – Some messages that we may eventually want to display in multiple languages (currently just including the English versions)
- legacy-redirects – Special routing for old lesswrong.com links (which are essentially deprecated, but maintained so that old links don't break)
- modules – A mix of code that didn't neatly fit into a more specific folder (in some cases the code was created before a more specific folder existed, and should probably be refactored)
- rss-integration – Callbacks and cron jobs associated with the rss feeds that are imported into lesswrong (i.e. some users have outside blogs that are automatically imported as LW posts via their rss feed). Note that additional code related to this is in the rssfeed collection.
- scripts – One-off scripts that we run periodically to import content from the old LW database, or to update documents when we change the schema.
- search – We use a tool called Algolia to manage our internal search engine. This folder contains callbacks and utils for updating the search engine as posts and comments are created.
- subscriptions – I haven't actually used this, and habryka is busy this month so I haven't checked in on this. Will update once I get a chance to talk to him.
The lib folder also contains some files. The one you'll most likely need to refer to is routes.js, which establishes the url routing for the site.
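As a rough picture of what url routing declarations involve, here is a sketch that assumes each route is declared as an object with a name, a path and a component name (the shape Vulcan's addRoute is typically called with). The addRoute and matchRoute functions below are simplified stand-ins, not the real router, and the route and component names are hypothetical.

```javascript
// Stand-in route table (assumption: the real router is far more capable).
const routes = [];

function addRoute(route) {
  routes.push(route);
}

// Hypothetical routes: a static path and a path with a parameter.
addRoute({ name: 'home', path: '/', componentName: 'Home' });
addRoute({ name: 'posts.single', path: '/posts/:_id', componentName: 'PostsSingle' });

// Minimal matcher: a ':param' segment matches any single url segment.
function matchRoute(url) {
  const urlParts = url.split('/').filter(Boolean);
  for (const route of routes) {
    const routeParts = route.path.split('/').filter(Boolean);
    if (routeParts.length !== urlParts.length) continue;
    const params = {};
    const matches = routeParts.every((part, i) => {
      if (part.startsWith(':')) {
        params[part.slice(1)] = urlParts[i];
        return true;
      }
      return part === urlParts[i];
    });
    if (matches) {
      return { name: route.name, componentName: route.componentName, params };
    }
  }
  return null; // no route matched
}
```

With these declarations, matchRoute('/posts/abc123') resolves to the PostsSingle component with the _id parameter filled in, and an unknown url resolves to null.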
Up Next...
To really get started, you'll want to understand our collections and components folders – the bulk of our codebase. These are complex enough that I'll be covering them in a separate post.
3 comments
comment by Robert Miles (robert-miles) · 2021-10-26T20:15:01.938Z · LW(p) · GW(p)
Is there a public-facing API endpoint for the Algolia search system? I'd love to be able to say to my discord bot "Hey wasn't there a lesswrong post about xyz?" and have him post a few links
Replies from: habryka4
↑ comment by habryka (habryka4) · 2021-10-27T03:25:15.780Z · LW(p) · GW(p)
Pretty sure you should just be able to copy the structure of the query from the Chrome network tab, and reverse engineer it this way. IIRC the structure was pretty straightforward, and the response pretty well structured.
Replies from: robert-miles
↑ comment by Robert Miles (robert-miles) · 2021-10-27T10:43:44.293Z · LW(p) · GW(p)
Ah ok, thanks! My main concern with that is that it goes to "https://z0gr6exqhd-dsn.algolia.net", which feels like it could be a dynamically allocated address that might change under me?