Open Thoughts

Musings about open, semantic web tech by Matt Derocher

I am notorious for overthinking things. I have barely learned to swim because whenever I try, I overthink all the different parts of the process instead of just doing it. The same is true of so many things in my life: before I do something, I try to learn as much as I can about it.

While learning both CSS and JavaScript, I read books, did courses, listened to podcasts, and watched conference presentations about the languages. What I did not do much of: coding with the languages. I did not really play around or experiment. I did some, but it was small compared to how much time I spent passively learning. I wanted to make sure that I really understood a subject and would avoid any pitfalls that are common to new developers. But the folly in that way of learning is twofold:

  1. You can never fully learn something. In the cases of CSS and JavaScript, the two languages were morphing and growing while I was learning, so even if I learned all the existing stuff, it wouldn’t prepare me for the new stuff.

  2. There are so many nuances of any subject that only reveal themselves when you are in the thick of trying to do something. They are fringe problems that you would not think of learning about beforehand because you didn’t even know they existed.

I’d like to think I am getting better at not overthinking things before I try doing them, but I still have a long way to go. I’ve been working on a web clipper with highlighting and almost have a Minimum Viable Product, but I’ve also gotten stuck at points where I had to make a technical decision and spent a ton of time whiteboarding and reading about other projects doing similar things. The research has been useful, but just this past week I got back into the coding process. What came up quickly was that there were nuances in the code that I hadn’t even considered during my research phase. The research was helpful, but it didn’t give me all the answers. The only way to think through something completely is to think about it as you are actively doing it.

There's a lot of nuance around files vs. apps. An article has been going around that has some people choosing sides over whether apps or files are better. I don't have a definitive answer, but here are some things to think about in this area.

Part of the argument is that proprietary files lead to data loss because apps don't last forever. In theory, this is true, but MS Word files are proprietary, have been around for a very long time, and can be read by other apps such as LibreOffice.

JSON is a standard file format, but the data inside a file can take unlimited shapes. There is no guarantee (or really, likelihood) that one app can understand JSON from another app without a lot of intervention. (Cambria is one solution for this.)
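
To make that concrete, here is a minimal sketch with invented app names and fields: two apps store the same note as perfectly valid JSON, and neither can read the other's file without a handwritten mapping.

```javascript
// Two hypothetical apps storing the same note as perfectly valid JSON.
const noteFromAppA = {
  title: "Grocery list",
  body: "milk, eggs",
  created: "2023-01-15T10:00:00Z",
};

const noteFromAppB = {
  note: {
    heading: "Grocery list",
    content: [{ type: "text", value: "milk, eggs" }],
    meta: { createdAt: 1673776800000 }, // same moment, as a Unix timestamp
  },
};

// Neither app can read the other's file without a mapping like this:
function appBFromAppA(a) {
  return {
    note: {
      heading: a.title,
      content: [{ type: "text", value: a.body }],
      meta: { createdAt: Date.parse(a.created) },
    },
  };
}
```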

Even plain text faces this interoperability problem. We have syntaxes, such as Markdown, that we layer onto our text, but there are many different flavors of Markdown that don't all work together. And the more special meaning you add to plain text with syntax, the harder it becomes to read.

Complex things are also hard to write in text. A nicely formatted table is easy to read in Markdown (technically MultiMarkdown, since default Markdown doesn't have table support), but it is difficult to write and edit while keeping everything lined up. It is also easy to leave off a character the syntax needs and have everything break when you run it through an interpreter.

Also, when we use syntax like Markdown, we break the semantics of special characters. A * symbol was originally used for marking footnotes (true, its original meaning was arbitrary, as technically every symbol's is), but in Markdown it can mean a bullet point or that text should be italicized. The original meaning has changed, and we have to escape these special characters when we want to use them as originally intended.

I don't think there is a perfect solution. A theme that I am constantly bringing up in tech conversations is that every solution requires tradeoffs. I definitely see the benefits of text-based files. In certain environments, such as on a desktop computer, text-based files are the most versatile. Files can be indexed. There are command-line tools to batch-manipulate text files. Some operating systems even do version control for changes to text files. But none of these features are inherent to the text file itself; they come from systems that have built up around the format.

On the web, JSON is winning as the favorite file format because JavaScript can so easily manipulate JSON-like objects. There are tools to store, index, and batch-manipulate JSON files the same way text files are handled in desktop environments. Again, it is not that JSON is inherently better; the tools are there to make it powerful.
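
A small sketch of why this feels so natural in JavaScript: a JSON file parses directly into objects the language can filter and reshape, then serializes straight back, no extra tooling required.

```javascript
// JSON round-trips directly through JavaScript's built-ins.
const raw = '[{"title":"Post A","tags":["css"]},{"title":"Post B","tags":["rdf"]}]';

const posts = JSON.parse(raw); // file contents -> plain objects
const rdfPosts = posts.filter((post) => post.tags.includes("rdf")); // batch manipulation
console.log(JSON.stringify(rdfPosts, null, 2)); // objects -> file contents
```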

The problem with standards is that unless absolutely everyone agrees to use one, you have to find a way to interoperate with people who don't conform to it. From experience, I can safely say there will never be a time when we all agree on a standard. Even for something as simple as CSS, there are dozens of different ways to write it, and it all gets compiled to CSS in the end (I personally prefer Sass). The fact that they all compile to the standard makes it less of an issue. You can use what you already know and what works for you.

Every medium for ideas is corruptible. There’s no magic file format. Most are likely to last long enough for you to convert to something else if need be. It’s more important to find the constraint that works for you.

CJ Chilvers

I am currently working on an app for annotating documents. I have stalled because I am in a crisis about whether to save the annotations as an atJSON-style offsets list or as W3C JSON-LD annotations. I will probably need to convert between the two at some point, so the truth is, the storage format is kind of a trivial choice: you choose one format as the source of truth and convert to the other. From there, there are already existing tools to output JSON offsets as Markdown or HTML. (I am currently leaning towards W3C annotations, since they are more of a standard, but JSON offsets will be best for annotating content that is constantly changing.)
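
For anyone unfamiliar with the two, here is a rough sketch of the shapes I am weighing. The fields are simplified from the real atJSON and W3C Web Annotation models, the document URL is a placeholder, and the conversion can go either way; this one goes from offsets to W3C.

```javascript
// atJSON-style stand-off annotation: plain character offsets into the text.
const offsetAnnotation = {
  type: "highlight",
  start: 120, // character offset where the highlight begins
  end: 157,   // character offset where it ends
  attributes: { color: "yellow" },
};

// The same highlight as a (simplified) W3C Web Annotation in JSON-LD.
const webAnnotation = {
  "@context": "http://www.w3.org/ns/anno.jsonld",
  type: "Annotation",
  motivation: "highlighting",
  target: {
    source: "https://example.com/article", // placeholder document URL
    selector: { type: "TextPositionSelector", start: 120, end: 157 },
  },
};

// Treating the offsets list as the source of truth and converting outward:
function toWebAnnotation(annotation, sourceUrl) {
  return {
    "@context": "http://www.w3.org/ns/anno.jsonld",
    type: "Annotation",
    motivation: "highlighting",
    target: {
      source: sourceUrl,
      selector: {
        type: "TextPositionSelector",
        start: annotation.start,
        end: annotation.end,
      },
    },
  };
}
```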

So if I am making an app, I may not put everything in a text file. But as long as I have a way for users to get their data out by export or automatic conversion to a common file format, then I believe I meet the criteria for longevity of data.

The primary way I use Twitter is through lists. I have two lists (both currently private): one for history and one for what I call “Future Web,” an assortment of thinkers and practitioners working on the semantic web, tools for thought, web3 (though generally not the blockchain version but the IPFS-and-similar-technologies version), and web archival.

I use lists because I mostly use Twitter for learning. I have many different interests and at certain times my mind is more focused on one area. So when I am in the mood for history, I can open that list (props to Twitter for the new design on mobile that puts lists on the top of the screen so they are easy to access).

I slowly add to these lists by checking out retweets from people already on my list. I also occasionally remove people from the list because the subject matter falls outside of my main interests. I add someone to Future Web because I want to see their thoughtful ideas about tech. If they start talking too much about politics, do too much self-promotion, or go on too many rants, I remove them from the list. This is hard to do because you feel a little guilty. Also, I don’t want to put myself in an echo chamber and only hear what I already believe about things, but I also have to acknowledge that my mental capacity is limited and I cannot read everything from everyone.

I have used mute features at certain times when certain things had Twitter in an uproar. (I set a mute filter on the word “Twitter” after Twitter was bought by Musk. How meta is that?) But these filters are limited in their functionality. They can only block certain words, but cannot block ideas.

Let’s give an example. If I didn’t want any tweets about Large Language Models (LLMs), such as ChatGPT, in my feed, how would I do it? It would be pretty hard, because everyone, including people you never would have thought would care about AI, is talking about them. I could start by muting tweets with “ChatGPT” in them. But there are other language models, so I would also need to add words such as “GPT-3,” “ESMFold,” “MT-NLG,” “OpenAI,” and “AI.” But some companies use generic words for their products. There is one language model named “Bloom.” If you added that to your list of muted words, you would probably block content that had nothing to do with LLMs.

Manually blocking keywords probably wouldn’t be enough to filter out the content. There are so many words used for any subject, and so much overlap between words, that it is impossible to filter out a subject completely in this manner.
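
A quick sketch of how it falls apart in practice. This naive mute filter (the tweets are invented) flags a tweet about flowers because “Bloom” doubles as a model name, while a tweet that never names a model slips through.

```javascript
const mutedWords = ["chatgpt", "gpt-3", "openai", "llm", "bloom"];

function isMuted(tweet) {
  const text = tweet.toLowerCase();
  return mutedWords.some((word) => text.includes(word));
}

isMuted("Bloom's new model is impressive");              // true, as intended
isMuted("The cherry trees are in full bloom this week"); // true, a false positive
isMuted("Neat results from a new large language model"); // false, slips through
```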

Artificial intelligence, including (somewhat ironically, considering the example I chose) LLMs, is probably part of the solution. Though, I do have a hard time blindly putting my trust in AI. Facebook has the option to click on a post and “see fewer posts like this.” What exactly does that mean? It gives no clue of how it will filter out the content. And it might be using different criteria than you are thinking of. The post might be a picture of a cat in a cardboard box, and you click the button thinking it will show you fewer posts about cats, but the algorithm interprets it to mean you want to see fewer posts about cardboard boxes.

I keep saying it, but I think the semantic web can help with this. It allows content to be concretely associated with concepts. I don’t think the writer of a social media post should have to tag each word with RDFa, but what if, after a user posts, AI identified the concepts in the post? Then, when I click “see fewer posts like this,” it could bring up a list of concepts found in the post, and I could manually select the concepts to remove from my feed.
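
Here is a hedged sketch of what that could look like under the hood, with concepts represented as Wikidata-style URIs. The extraction step is hypothetical; Q146 really is Wikidata's ID for the house cat, while the cardboard ID is made up.

```javascript
// After a post is published, a (hypothetical) AI step attaches concept URIs.
const post = {
  text: "My cat loves sitting in cardboard boxes",
  concepts: [
    "http://www.wikidata.org/entity/Q146",    // house cat (real Wikidata ID)
    "http://www.wikidata.org/entity/Q000001", // cardboard box (made-up ID)
  ],
};

// "See fewer posts like this" lists the post's concepts; the user picks
// exactly which ones to mute, instead of guessing what the algorithm inferred.
const mutedConcepts = new Set(["http://www.wikidata.org/entity/Q146"]);

function shouldShow(p) {
  return !p.concepts.some((concept) => mutedConcepts.has(concept));
}

shouldShow(post); // false: the cat concept is muted, the boxes are innocent
```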

The semantic web is a beautiful promise. All data has real meaning, and it can interoperate with all other data. I can write an article about Abraham Lincoln and mark it up with RDFa so that it is known who I am talking about. Then some type of reasoning can be performed to search other datasets and bring back info about Lincoln, such as when and where he was born, how tall he was, who his spouse was, etc.
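
The lookup half of that is already possible if you wire it up by hand. Here is a sketch that asks Wikidata's public SPARQL endpoint for Lincoln's date of birth (Q91 is Lincoln's Wikidata ID, P569 the date-of-birth property); as the next paragraph says, it's everything around this step that lacks easy tooling.

```javascript
// Ask Wikidata's public SPARQL endpoint for Lincoln's date of birth.
// wd:Q91 is Abraham Lincoln; wdt:P569 is "date of birth".
const query = `
  SELECT ?birthDate WHERE {
    wd:Q91 wdt:P569 ?birthDate.
  }
`;

const url =
  "https://query.wikidata.org/sparql?format=json&query=" +
  encodeURIComponent(query);

const response = await fetch(url, {
  headers: { Accept: "application/sparql-results+json" },
});
const data = await response.json();

// SPARQL JSON results put the rows under results.bindings.
console.log(data.results.bindings[0]?.birthDate.value);
// -> "1809-02-12T00:00:00Z"
```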

But the problem is, there is no easy way to implement what I just described end to end. There are people working on the idea. Some of the demos look very impressive. But for now, they are just demos. There are some JavaScript developer libraries aimed at making it easy to work with RDF data, but it seems like most of them aren’t ready for modern development (for example, most of the libraries I tried only work with Webpack, which was the main bundler a few years back, while newer bundlers like Vite are becoming the default). A lot of them are also created as part of a research project, and development seems to stop after the research is over.

What I want is for all my data to work together. I want to be able to have a group of browser tabs associated with my to-do list. Then I can close the tabs, and reopen them in one click from my to-do app. I also want to be able to write notes in a separate app and include tasks in them that show up in the same to-do app. I want to be able to associate PDFs and other files with my to-do app—not in an upload attachment type of way, but just to reference a document that lives somewhere else from my to-do app.

A lot of apps and services provide APIs. Some apps use other apps’ APIs to create connections. For example, a to-do app can use a specific calendar app’s API to connect your to-dos to your calendar. But that covers just one specific calendar app. There are many different calendar apps, all with different APIs, so the developers have to learn them all and write each connection by hand.
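
A minimal sketch of the problem, with two invented calendar APIs: every new calendar the to-do app wants to support means another handwritten adapter.

```javascript
// Two invented calendar APIs returning differently shaped events.
const eventFromCalendarA = {
  summary: "Dentist",
  startsAt: "2023-05-01T09:00:00Z",
};
const eventFromCalendarB = {
  name: "Dentist",
  start: { dateTime: "2023-05-01T09:00:00Z" },
};

// The to-do app needs one handwritten adapter per calendar it supports.
const adapters = {
  calendarA: (e) => ({ title: e.summary, start: e.startsAt }),
  calendarB: (e) => ({ title: e.name, start: e.start.dateTime }),
  // calendarC, calendarD, ... every integration is more bespoke code.
};

adapters.calendarA(eventFromCalendarA); // { title: "Dentist", start: "..." }
```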

I remember when IFTTT (If This, Then That) first came out. It seemed like magic. It allowed you to connect so many different services, even if the developers of those apps didn’t write specific connections. I didn’t really know any kind of scripting at the time, so it allowed me to connect my different buckets of data together. Of course, it wasn’t perfect. Data syncing was just one-directional. The two data buckets didn’t actually know about each other. One data bucket just slurped in data that was passed to it. There was no way to update that data and send it back. (Of course, you could create some kind of connection back the other way, but it would create a new data entry and not really update the original data.)

What if we had an IFTTT for the semantic web? What if I could set up connections between data sets without knowing any coding? I’m not exactly sure what it would look like. Since most RDF data sources expose SPARQL endpoints, those would probably be the connecting secret sauce. (There is also Project Cambria, which is a non-semantic-web way of connecting two different data sources together.) We would still need tools to make authoring RDF data easier. And easier ways to integrate RDF data into our development. These would be the foundations to make something like this even possible.

There is no lack of Tools for Thought (TfT) applications today, but all of them require you to store all your data in one application. What if we could figure out a way to use different apps for different tasks but work from the same documents and data?

Motivation

I am a history buff. Not in the normal way of going to museums or watching documentaries, but in the way of searching for hours through digital newspaper archives looking for certain information to answer a question and then writing about what I found.

I used to use Evernote as a system for managing research for history writing. Evernote has an excellent web clipper, so it was really easy to get things into it. It was not so easy to do anything with the notes and files when they were inside it. I tried creating an elaborate system of notebooks (folders) and tags and internal links, but it became obvious that the app was not meant for that level of use.

Evernote can store any kind of file, but it can only open a handful of file types. I would open the different file types in separate reader apps, each of which had a unique annotation system. I was able to get Evernote to pull in most of the annotations using various hacks, but the sync was one-directional and the annotations were not linked back to the source.

I built a system on top of Sanity, which is a hosted, headless CMS. It was really easy to create data types, so I could have entities like people, places, or dates that could be linked to in blog posts. Sanity has a powerful query language, so I was able to list all posts that mentioned an entity. I created an auto-fill component in the CMS that would let you search Wikidata for an entity and populate information like name, date of birth, and a description. I also started building a web clipper that would help me collect data into the system.
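
For the curious, the entity query was along these lines. This is a from-memory sketch using Sanity's JavaScript client and GROQ's references() function; the document type and field names come from my own schema, and the project ID and entity ID are placeholders.

```javascript
import { createClient } from "@sanity/client";

const client = createClient({
  projectId: "your-project-id", // placeholder
  dataset: "production",
  apiVersion: "2023-01-01",
  useCdn: true,
});

// GROQ: every post whose fields reference a given entity document
// (a person, place, or date in my schema).
const query = `*[_type == "post" && references($entityId)]{ title, slug }`;
const posts = await client.fetch(query, { entityId: "some-entity-id" });
```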

At first, I liked the system and was able to use it to post some research, but I eventually decided to move away from this solution, too. Besides some quirks and limitations in the API (some of which have been fixed since I tried the project), the main issues I had with the platform were that it wouldn't work offline and that the only option was for Sanity to host your data.

Guiding ideas

My goal is for researchers to be able to collect digital sources (websites, documents, media files, and named entities), make annotations on any file type, use these sources and annotations in their writing, and then link back to the original sources.

Avoid lock-in

I have heard about some powerful and popular Tools for Thought applications. I haven't deeply used any of them, because, after my experience with Evernote, I am afraid of getting locked into something that doesn't quite meet my needs. I want to explore ways for people to be able to use my apps without feeling that it is all or nothing.

Small apps for specific tasks

In the old days when we had all our files on our local hard drives, we could use many small applications that had very specific uses. I want to try to capture that flexibility and interoperability between apps. Instead of creating one app that does all the things I need, I can create smaller apps that handle specific activities but can all understand the same files and data.

Smaller apps also mean faster development time and they avoid the complexity and slowness that can come with apps that try to do too much.

Interoperability

If each tool is opt-in, then other developers could create apps that work in the ecosystem, and users could adopt them without having to stop using the apps they already have. To increase the possibility of interoperability, I want to take advantage of standards like RDF and JSON-LD as much as possible.

Components of the ecosystem

These are the current pieces of the ecosystem that I am working towards:

  1. Web Clipper – Save web pages, PDFs, EPUBs, and media files. I have started on this and have a basic implementation (almost) working.

  2. Reader – Read all the different file types with user preferences and the ability to highlight and annotate. Annotations can also include Named Entity Recognition to find names of people and places in the content. I currently have a basic reader (without annotations) combined with the clipper, though they will probably be separated at some point.

  3. Annotation manager – A way to view all your annotations in one place. This would probably be a stepping stone to the studio, as mentioned in the next point.

  4. Authoring Studio – Write and combine annotations into notes or blog posts. I imagine something close in concept to current TfTs like Roam or Tana.

Besides these, the possibilities are many, including Dropbox-like local sync or an app that would take a video or audio file, create a transcript with voice-to-text, and then allow annotations on the content.

Technical foundations

  • Content addressing (IPFS, WNFS) – If I have a copy of a PDF and you have a copy, then we can “talk” about the same file. This will also increase the lifetime of a source document because multiple people can host the file (this cuts down on link rot).
  • Stand-off properties (atJSON, W3C web annotations) – To preserve content addressing, we want to manipulate source documents as little as possible. Stand-off properties allow each user to have their own annotations on a shared document. These properties are essentially optional layers on top of the source document (see the sketch after this list).
  • Semantic data (RDF, JSON-LD) – There have to be ways for other applications to know what is being talked about. It will also allow you to find relations between items that you have collected.
  • Collaboration/interoperability (ActivityPub, Linked Data Notifications) – This would allow people to join groups and get notified when someone annotates a document they have an interest in.
  • Offline access (WNFS) – It is important for me that apps are fast and accessible at all times because I spend a lot of time in Africa where there is metered and sometimes spotty network access.
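
To show how a few of these foundations could fit together, here is a hedged sketch of a single stand-off annotation over a content-addressed document, written as simplified W3C Web Annotation JSON-LD. The CID is a placeholder and the vocabulary choices are mine, not a settled design.

```javascript
// A stand-off annotation over a content-addressed document, as simplified
// W3C Web Annotation JSON-LD. Anyone with the same file derives the same
// CID, so different people's annotation layers can point at one target.
const annotation = {
  "@context": "http://www.w3.org/ns/anno.jsonld",
  type: "Annotation",
  motivation: "commenting",
  body: {
    type: "TextualBody",
    value: "This paragraph is about Abraham Lincoln.",
  },
  target: {
    source: "ipfs://<cid-of-the-pdf>", // placeholder, not a real CID
    selector: { type: "TextPositionSelector", start: 2040, end: 2118 },
  },
};
```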

Conclusion

I have been working on this part-time as a side project when I am not working for my day job as a front-end developer. Inspired by Linus Lee's ideas of creating tools that you would use yourself, I am trying to get to what I am calling MMP (Minimum Matt Product) so that I can use these tools in my research and everyday reading. If I can get it to where I will use it every day, then I will keep working out the details so that it can be usable by more people.

The promise of technology

There was a time in my late teens/early twenties when I was enamored with technology. I had swapped out my Windows laptop for a MacBook Pro. Every week, I eagerly awaited the new episode of the Mac Power Users podcast. While waiting for the new episode, I would work through the backlog.

I was fascinated by what were called “workflows” and how different apps could be automated and pass data back and forth between each other. This was achieved using applications like Automator, TextExpander, Hazel, and Alfred. These apps came with a semi-hefty price tag (at least it seemed so for a teen who worked at Subway), but it was a one-time price, so if you bought an app, you owned it forever. Sometimes upgrades to major versions cost money, but you could opt out and still use the old version.

Spotlight indexed everything on the computer, so I could search and see my Evernote notes right next to my emails and Pages documents. Apps like Picasa and iTunes also indexed your photos and music, respectively. You had all your data in your control, and you could use a Time Capsule and/or Carbon Copy Cloner to make sure that if something happened to your computer, you wouldn’t lose your files. (I used both, plus Carbonite for off-site backup.)

Dropbox came out, and it was like magic! You could sync all your files between devices. You still maintained full control over them. Mobile apps would connect to the Dropbox API, so it would even (sorta) work on your phone and tablet.

Broken promises

Fast forward to today, and I’ve lost almost all of that fascination with technology. I work on the web, so I am at a computer more than ever before in my life. I switched my laptop back to Windows partly because I refused to pay the “Apple tax” and partly as a protest against Apple’s ethos of a walled garden.

I don’t buy a new phone every year anymore, either. I’ve had the same phone for three years, and I’m hoping to get at least one more year out of it.

There are quite a few reasons why my feelings about technology have changed. Some of them are personal, and some are because of how technology evolved.

Personal growth

I’ll start with some positive reasons. It’s kind of simple: I’ve matured as a human being. I am more thoughtful about how I spend my money. My three-year-old phone does basically everything a new phone does. It has a good camera, and people still give me compliments about how good the camera is. Unless it breaks or a new phone offers something significantly different, why would I upgrade? (I have been tempted more than once to buy a foldable phone because it is extremely different from what I have.)

I also want to spend less time on technology. I don’t want to be identified by technology. I don’t want to be either “the guy who’s always on his phone” or “the computer guy.” (Not really positive that I’ve succeeded in this, but I’m trying in some ways.)

The Internet changed everything (and not all for the good)

Going back to the tech itself, everything is different because the Internet has gotten so much more powerful. You don’t need your laptop to do a bunch of automation anymore because everything is done on the web. In fact, as things move to the web, some things are no longer possible to manipulate on your own computer. Companies like Twitter and Facebook have changed how their APIs work, so what can be retrieved from them is limited. We have online-only file formats, such as Figma’s, that can’t be accessed without the Internet.

Instead of controlling your data and syncing it with a service of your choice, companies all have their own proprietary ways of managing your data.

Because they have control of your data and you can’t easily access it without the company, they can charge money each month for you to be able to access your data. (Is it an exaggeration to call it ransomware?) Every service charges for its one little utility, and at even $2.99 a month, in less than a year a single service costs more than most of the one-off apps you bought for the laptop. Surprisingly, even though they already charge a subscription price, many also sell your data to advertisers.

There is web automation in the form of IFTTT and Zapier, but they don’t really match the power of Hazel and the like.

Not to mention, web stuff seems really buggy. Nothing is ever finished, so everything is in a constant flux of new features that bring new bugs.

Making new promises

All that said, I haven’t given up on technology. I may not like where it is, but I am hopeful for what it can become. The web is powerful, and it opens up possibilities that weren’t available when things stayed locally on our separate computers. It’s going to take time for things to be as seamless as they felt on local computers because things are a hundred times more complicated on the web. There are issues of security, scale, and interoperability. And, yes, these things will take time and effort to figure out, so there will need to be a constant stream of money to make it progress. (Exactly how and to whom this money should be directed is a large discussion in itself.)

I still listen to technology podcasts (and read blogs and books), but the focus isn’t on using existing tech and apps; it is on how we can make better tech that solves these problems. There are many smart people looking to solve these problems, people who want to help the web reach its potential.

I am starting this blog because I want to bring more notice to the people working on these things and also add my ideas to the conversation.
