Decentralized Search in 2020

rushabh · 28 January 2020 03:03

Parking ideas for a decentralized search engine from first principles in 2020

Goal - to build a new age search engine (fun project) that is semantic and decentralized.

Some initial thoughts

Multi-nodal desktop app
- Electron UI (or PWA)
Central node (CN)
- There will be a central node which will maintain:
  - Metadata (models)
  - Views (a model can have multiple published views, like “data grid”, “catalog”, “photos”, “conversation”, “resume” etc.)
  - Extractors (scripts)
  - Directory (of all nodes)
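
To make the central node's responsibilities concrete, here is a minimal in-memory sketch. Everything in it is an assumption for illustration: the `CentralNode` class, its method names, and the example address are hypothetical and not part of any existing design.

```python
from dataclasses import dataclass, field


@dataclass
class CentralNode:
    """Hypothetical in-memory registry; all names here are illustrative."""
    models: dict = field(default_factory=dict)      # model name -> list of field names
    views: dict = field(default_factory=dict)       # model name -> list of view names
    extractors: dict = field(default_factory=dict)  # extractor name -> script source
    directory: dict = field(default_factory=dict)   # node id -> network address

    def publish_model(self, name, fields):
        self.models[name] = list(fields)
        self.views.setdefault(name, [])

    def publish_view(self, model, view_name):
        # A view can only be published for a model the CN already knows.
        if model not in self.models:
            raise KeyError(f"unknown model: {model}")
        if view_name not in self.views[model]:
            self.views[model].append(view_name)

    def register_node(self, node_id, address):
        self.directory[node_id] = address


cn = CentralNode()
cn.publish_model("Person", ["name", "url"])
cn.publish_view("Person", "resume")
cn.publish_view("Person", "data grid")
cn.register_node("node-1", "http://localhost:8000")
```

A real CN would need persistence and some form of moderation for contributed metadata; the sketch only shows how models, views, extractors and the node directory relate to each other.
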
Modeling
- The user should be able to add / edit models. A user can contribute metadata to the CN
- Semantic modeling (Bootstrap models from schema.org)
- Features:
  - Ability to add / edit models
  - Ability to add / edit objects
Data Extraction
- Users should be able to write and publish “extractors”
- Source Plug-in architecture (example Wikipedia plugin, StackOverflow plugin, Reddit plugin, Amazon plugin etc.).
Data Source
- Users should be able to store and publish objects.
- Users should also publish a data directory
- The source plugins will extract information and map it semantically into a local SQLite database.
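
A rough sketch of how a source plugin could feed the local SQLite store, using only Python's standard `sqlite3` module. The `Extractor` interface, the `objects` table layout, and the schema.org-style keys are assumptions made for this example, not a settled design.

```python
import json
import sqlite3


class Extractor:
    """Hypothetical plugin interface: subclasses turn raw source data into objects."""
    source = "base"

    def extract(self, raw):
        raise NotImplementedError


class WikipediaExtractor(Extractor):
    source = "wikipedia"

    def extract(self, raw):
        # Map source-specific keys onto schema.org-style property names.
        return {"@type": "Article", "name": raw["title"], "url": raw["link"]}


def open_store(path=":memory:"):
    db = sqlite3.connect(path)
    db.execute(
        "CREATE TABLE IF NOT EXISTS objects ("
        "id INTEGER PRIMARY KEY, source TEXT, type TEXT, name TEXT, data TEXT)"
    )
    return db


def store(db, extractor, raw):
    obj = extractor.extract(raw)
    db.execute(
        "INSERT INTO objects (source, type, name, data) VALUES (?, ?, ?, ?)",
        (extractor.source, obj["@type"], obj["name"], json.dumps(obj)),
    )
    db.commit()
    return obj


def search(db, term):
    # SQLite's LIKE is case-insensitive for ASCII characters by default.
    rows = db.execute(
        "SELECT name, data FROM objects WHERE name LIKE ?", (f"%{term}%",)
    )
    return [json.loads(data) for _name, data in rows]


db = open_store()
store(db, WikipediaExtractor(), {
    "title": "Semantic Web",
    "link": "https://en.wikipedia.org/wiki/Semantic_Web",
})
results = search(db, "semantic")
```

Storing the full object as JSON next to a few indexed columns keeps the table generic across models; a StackOverflow or Reddit plugin would only need its own `extract` method.
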
View
- Users should be able to write and publish views in HTML/CSS
- Multiple-object view, with plugins for rendering objects as lists, graphs, images
- Single-object view, with templates for each object type, also showing modeled objects with internet links for further search
Sharing
- Ability to publish / share models, objects, extractors, views in-app
Network
- Offline first
- Personal node first
- Personal node can be set up as a server
- Ability to create a peer network (node to node)
- Ability to search a peer via a plugin
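
One way to picture "offline first, personal node first" is a search that answers from local data and then asks directly connected peers. This is only an in-process sketch: the `Node` class is hypothetical, and real peers would be reached over the network rather than by direct method calls.

```python
class Node:
    """Hypothetical node: holds local objects and knows a list of peers."""

    def __init__(self, name, objects=None):
        self.name = name
        self.objects = list(objects or [])
        self.peers = []

    def search_local(self, term):
        term = term.lower()
        return [o for o in self.objects if term in o.lower()]

    def search(self, term):
        # Offline first: answer from local data, then ask peers one hop away.
        results = [(self.name, o) for o in self.search_local(term)]
        for peer in self.peers:
            results.extend((peer.name, o) for o in peer.search_local(term))
        return results


me = Node("me", ["SQLite notes"])
friend = Node("friend", ["SQLite vs graph databases", "IPFS primer"])
me.peers.append(friend)
hits = me.search("sqlite")
```

Each hit carries the name of the node it came from, so a view could show which peer supplied a result.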

Reading

Pears: https://pearsearch.org/

Blog on personal search: https://beepb00p.xyz/pkm-search.html

knadh · 28 January 2020 13:46

Interesting idea. Seems more like a knowledge store/graph than a search engine. SQLite / or an RDBMS probably wouldn’t be the best way to model semantic relationships. A graph DB would be better suited.

Speaking of knowledge graphs, check out https://cayley.io and https://perkeep.org/ (previously Camlistore. Not a knowledge graph but a personal “life store”).

nilesh · 8 February 2020 13:55

An interesting project is https://connectordb.io/ - intended to be a personal Google Analytics equivalent (known as “Quantified Self”). It has become dormant, but they did build a number of data collection tools for various platforms (browsers, OS).

rushabh · 2 March 2020 09:13

A social search (powered by fossasia) https://loklak.org

Bhupesh_Varshney · 30 August 2020 14:16

If we are talking about a search engine, then we possibly don’t need an Electron UI for this.
Moreover, we might be able to leverage IPFS for distributed storage needs as well.
I have yet to figure out a model for how this will actually work, but for example:

Every node gets preference if they help us index the site

Just like seeding files in a torrent-based architecture gets us higher download speeds.
That’s where the search engine algorithm comes in. I know we are treating this as a “fun” project, but I think engineering this properly is important as well.
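
To illustrate the "preference for nodes that help index" idea, here is a small sketch of a query queue ordered by indexing contribution. The `Scheduler` class and its scoring rule (raw count of pages indexed) are assumptions; nothing like this has been designed yet.

```python
import heapq


class Scheduler:
    """Hypothetical queue that serves nodes with more indexing work first."""

    def __init__(self):
        self.indexed = {}   # node id -> number of pages the node has indexed
        self._queue = []
        self._counter = 0   # tie-breaker keeps FIFO order among equal contributors

    def record_indexing(self, node_id, pages):
        self.indexed[node_id] = self.indexed.get(node_id, 0) + pages

    def submit(self, node_id, query):
        # heapq is a min-heap, so negate the contribution to pop the largest first.
        priority = -self.indexed.get(node_id, 0)
        heapq.heappush(self._queue, (priority, self._counter, node_id, query))
        self._counter += 1

    def next_query(self):
        _priority, _order, node_id, query = heapq.heappop(self._queue)
        return node_id, query


s = Scheduler()
s.record_indexing("seeder", 120)
s.submit("leecher", "cats")
s.submit("seeder", "dogs")
first = s.next_query()
```

Here the "seeder" node is served first even though it submitted its query later. Priority is fixed at submission time, and a raw page count would be easy to game, so a real scheme would need some way to verify the indexing work.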

rushabh · 30 August 2020 15:16

I also realize there are 2 steps to “freeing” a service

1. Make it community owned
2. Make it distributed

While we jump to #2, we should not forget #1. This is where we hope FOSSUnited can be a platform to manage community owned versions of popular services like search, social, market etc.

Also, #1 is much easier to do given the current internet infrastructure. It is also much easier to sustain and moderate content on community-owned platforms.