linas/atomspace-ipfs: IPFS backend to the AtomSpace


Project name: linas/atomspace-ipfs

Project URL: https://github.com/linas/atomspace-ipfs

Language: C++ 82.2%

Description:

atomspace-ipfs

IPFS driver backend to the AtomSpace (hyper-)graph database.

The code here is a backend driver to the AtomSpace graph database, enabling AtomSpace contents to be shared via the InterPlanetary File System (IPFS) network. The goal is to allow efficient decentralized, distributed operation over the global internet, allowing many AtomSpace processes to access and perform updates to large datasets.

The AtomSpace

The AtomSpace is a (hyper-)graph database whose nodes and links are called "Atoms". Each (immutable) Atom has an associated (mutable) key-value store. The AtomSpace has a variety of advanced features not normally found in ordinary graph databases, including an advanced query language and "active" Atoms.

IPFS

IPFS, the InterPlanetary File System, is an internet-wide, globally accessible file system, built on top of a distributed hash table for addressing files by content, wherever they may be located on the network. It provides decentralized file storage.

Important Notice !!

There are fundamental design issues with this implementation that appear to be unresolvable with the current APIs offered by IPFS. All potential users and developers are strongly urged to make use of the OpenDHT-based AtomSpace backend instead!

It appears that the OpenDHT API is an excellent fit for the AtomSpace, providing exactly those kinds of features that the AtomSpace needs!! Woo Hoo!!

Because of this situation, it seems unlikely that the development of the IPFS driver will continue beyond the current version. Again, please use the OpenDHT-based backend driver instead!

Beta version 0.2.0

The driver here was developed and tested with IPFS version 0.4.22-.

Status

In the current implementation:

A design for representing the AtomSpace in IPFS has been chosen. It has numerous shortcomings, detailed below.
System is feature-complete. All seven unit tests from the original AtomSpace-SQL test suite have been ported. Six of the seven pass -- the seventh one is a multi-user test, and the multi-user design used here is (deeply) flawed. See further comments below. The basic tutorials work as documented. So basically, everything works, as long as only one user at a time is modifying the AtomSpace contents. Multiple users could edit the same AtomSpace, if they were careful to tell each other what their latest CID was. Otherwise, each user ends up forking the AtomSpace, and the forks never get merged back together again. This is a design flaw: IPFS does not provide any way of doing decentralized set membership. This forces the entire AtomSpace to be mapped into just one file, making it very highly "centralized". Since it's just a file, it can be forked. See comments below.
Due to IPFS bugs with the performance of IPNS, IPNS is mostly unused in this implementation. This means that users need to arrange other channels of communication to find out what the latest AtomSpace is (by sharing the AtomSpace CID in some other way, rather than sharing via IPNS).
Many or most operations are slow. Like really, really slow. Like, a dozen-atoms-per-second slow. Which is unusable for a production database. In a few cases, performance could be improved by better caching. In most cases, this is a fundamental limitation of the current design. It might be a fundamental limitation of IPFS, since IPFS is not optimal for handling very small objects, and (most) Atoms are just tiny.

Centralization

There does not seem to be any way of mapping the AtomSpace into the current design of IPFS+IPNS without resorting to a single, centralized directory file listing all of the Atoms in an AtomSpace. Implementing a single, centralized directory file seems like a "really bad idea" for all of the usual reasons:

When it gets large, it does not scale.
Impossible to optimize fetch of atoms-by-type.
Hard to optimize fetch of incoming set.
Unresolvable update conflicts (race conditions) when there are multiple writers (i.e. with multiple writers, it's unclear which of the published AtomSpace versions is authoritative. As a result, each writer effectively creates a forked version of the AtomSpace, and there's no particular way to merge the forks back together again. See MultiUserUTest for a failing example of the resulting badness.)
Performance bottlenecks when there are multiple writers.

Despite these design flaws, I went ahead and wrote the code anyway. It helped clarify the issues. A better design is needed, but that better design seems to be blocked without changes to the IPFS core system. Decentralized updates are sorely needed.

A suitable decentralized design would be possible, if IPNS was extended with one additional feature (or if some other system was used, taking the place of IPNS). Currently, IPNS does this:

    PKI public-key ==> resolved CID

The ideal enhanced-IPNS lookup would be this:

    (PKI public-key, hash) ==> resolved CID

Details are described below.

Known Bugs

There are several bugs that are known, but are problematic to fix:

Centralized directory, as noted above.
Race conditions if multiple users update the same AtomSpace at the same time. These race conditions will result in lost data (lost Atom inserts, deletes, or lost changes of TruthValues or other Values.)
Atom removal is a heavy-weight operation, due to heavy interaction with incoming sets.

Architecture:

This implementation provides a complete implementation of the standard BackingStore API from the AtomSpace. It's a backend driver.

The git repo layout is the same as that of the AtomSpace repo. Build and install mechanisms are the same.

Design requirements:

To have any hope of uniqueness and non-collision of Atoms, each Atom must get its own globally unique hash, viz. a crypto-secure 256-bit (32-byte) hash. This is considerably larger than the non-crypto-secure 64-bit hash used in the current AtomSpace implementation.
To avoid hash collisions, the Atom Type has to be hashed in, as a string, because numeric Atom Type assignments (currently 16-bit short ints in the AtomSpace) cannot be made global safely.
Although Atoms are globally unique and immutable, the associated values are mutable, and also vary depending on which AtomSpace they belong to.
How do we associate mutable data with an Atom? Specifically:
- the Values on the Atom;
- the various AtomSpaces the Atom belongs to;
- the slowly changing Incoming Set.
To summarize: Values and Incoming Sets are aspects of the AtomSpace, and not of the Atom itself. Different AtomSpaces will typically see different Values and different incoming sets for any given Atom (and any given Atom might not even belong to a given AtomSpace).
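
The requirement that the Atom Type be hashed in as a string can be illustrated with a minimal sketch. The `Atom` struct, the `canonical_text` function and the s-expression layout are assumptions made for illustration, not the driver's actual classes or file format. The content hash itself would be computed by IPFS over whatever text is stored, so no hash function appears here.

```cpp
#include <string>
#include <vector>

// Hypothetical, simplified Atom (not the AtomSpace's real classes):
// a type name, plus either a node name or an outgoing set.
struct Atom {
    std::string type;            // type as a string, e.g. "ConceptNode"
    bool is_node;                // true for Nodes, false for Links
    std::string name;            // used only by Nodes
    std::vector<Atom> outgoing;  // used only by Links
};

// Build the text that would be content-hashed. The type appears by
// name, never as a process-local numeric code, so two machines with
// different numeric type assignments still produce identical text.
// Simplification: quotes inside a node name are not escaped.
std::string canonical_text(const Atom& a)
{
    std::string s = "(" + a.type;
    if (a.is_node) {
        s += " \"" + a.name + "\"";
    } else {
        for (const Atom& o : a.outgoing)
            s += " " + canonical_text(o);
    }
    return s + ")";
}
```

Only the immutable parts of the Atom appear in this text. Values and incoming sets are left out, so they cannot affect the Atom's identity.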

Design choices and issues:

The first two bullets are satisfied by writing the Atom type and its name (if it's a Node) as text into a file. For Links, the outgoing set can be placed in the IPLD links[] json element. These will be automatically hashed by the IPFS subsystem, delivering a true globally unique ID (the CID) for the Atom, exactly as desired.
We distinguish between the GUID of the Atom, and the CID of the Valuation. The GUID of the Atom is its IPFS CID, when considering only the Atom itself, and not its values or incoming set. Thus, it really is globally unique and non-varying. By contrast, the file containing the Valuations will change whenever the Values change, and so the CID attached to an Atom will be changing. The tricky part of the design is to associate this immutable GUID with the current CID for that Atom.
Conceptually, the AtomSpace is nothing more than a set of (GUID, CID) pairs, such that there can only ever be just one CID per GUID. That is, the AtomSpace is a time-varying map from GUID to CID. This automatically enforces the other requirements:
- If a GUID is not a part of the AtomSpace, then that Atom is not in the AtomSpace.
- The Atom can have other Values in other AtomSpaces, but it can have only one Valuation in this AtomSpace.
Each read-only AtomSpace corresponds to a directory, so that each Atom appears in the links[] json member of the directory. Updated AtomSpaces are published on IPNS, so that a read-write AtomSpace corresponds to a unique key (the key used to generate the IPNS name). That is, when an Atom is added to/removed from the AtomSpace, the links[] list is modified to add/remove that Atom, thus creating a new IPFS CID. Then IPNS is updated to point at this new AtomSpace CID. The good news: one knows exactly which version of an AtomSpace one is working with (this is very unlike the current AtomSpace!)
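
The "time-varying map from GUID to CID" can be sketched with ordinary containers. This is an in-memory illustration only: the names `AtomSpaceSnapshot`, `with_atom`, `without_atom` and `contains` are hypothetical, plain strings stand in for real CIDs, and nothing here talks to IPFS.

```cpp
#include <map>
#include <string>

// Stand-ins for IPFS content identifiers (assumption: plain strings).
using Guid = std::string;  // CID of the immutable Atom itself
using Cid  = std::string;  // CID of that Atom's current Valuation file

// One AtomSpace version: at most one Valuation CID per Atom GUID.
using AtomSpaceSnapshot = std::map<Guid, Cid>;

// Adding or updating an Atom returns a new snapshot and leaves the
// caller's copy untouched, mirroring how editing the directory
// yields a new AtomSpace CID.
AtomSpaceSnapshot with_atom(AtomSpaceSnapshot snap, const Guid& g, const Cid& c)
{
    snap[g] = c;  // replaces any earlier Valuation for this GUID
    return snap;
}

// Removing an Atom likewise produces a new snapshot.
AtomSpaceSnapshot without_atom(AtomSpaceSnapshot snap, const Guid& g)
{
    snap.erase(g);
    return snap;
}

// An Atom is in this AtomSpace exactly when its GUID is a key.
bool contains(const AtomSpaceSnapshot& snap, const Guid& g)
{
    return snap.count(g) != 0;
}
```

Two writers who each call `with_atom` on the same base snapshot end up holding two different snapshots, and nothing in this structure says how to merge them. That is the forking problem described in the Status and Centralization sections.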
Currently, IPNS is slow. A core assumption in the design is that someday, this will be fixed, and IPNS will be fast. Or that, at least, the IPNS latency will be immaterial, and that we'll work with the most-recently-resolved values.
Design alternative A:
- Every Atom has a corresponding PKI key. So, millions of keys. The key name is globally unique: it is just the name of the AtomSpace, followed by the scheme string of the Atom.
- The private part of the PKI key is held by the key-creator. It stays private, unless shared.
- The AtomSpace is a single file, listing all of the Atoms in it, together with all of the public keys for each Atom. This means that the AtomSpace is centralized.
- If a user wants to find the current Valuation of an Atom, they must:
  - Obtain the AtomSpace file somehow (either someone gives the user the CID of the current AtomSpace, or the user obtains the CID from an IPNS lookup.)
  - Look up the Atom in that file, if present.
  - Examine the public key of that Atom.
  - Perform the IPNS lookup for that key.
  - Fetch the file corresponding to the CID that IPNS returned.
  - Parse the file, extract the desired Value.
- Incoming sets are stored along with the Valuation file.
- If a user wants to change (update) the Valuation of an Atom, they must:
  - Obtain the private key for that Atom/AtomSpace combination, by asking someone for it.
  - Update the Valuation file.
  - IPNS publish the new file.
  Note that the updates to the IncomingSet are conflict-prone. So a CRDT format for IncomingSets is required.
Issues:
- Publishing a single large AtomSpace file is ugly; it prevents simultaneous, high-speed updates. It's centralized and not scalable. There are conflict-resolution issues if there are multiple updaters.
Status:
- The above was NOT followed, in that IPNS was avoided. Instead, there is a master AtomSpace file containing only IPFS CIDs for Valuations.
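
Design alternative A notes that IncomingSets would need a CRDT format. The source does not say which CRDT was intended; the sketch below assumes a two-phase set (2P-set), one of the simplest conflict-free set types, purely to show what a conflict-free merge looks like. The `IncomingSet` struct and the `merge` function are hypothetical names, not part of the driver.

```cpp
#include <set>
#include <string>

// Hypothetical CRDT for an incoming set: a two-phase set (2P-set).
// Elements are the GUIDs of the Links that contain the Atom.
struct IncomingSet {
    std::set<std::string> added;    // every GUID ever added
    std::set<std::string> removed;  // tombstones for removed GUIDs

    void add(const std::string& guid) { added.insert(guid); }

    // A GUID can be removed only after this replica has seen it added.
    void remove(const std::string& guid)
    {
        if (added.count(guid) != 0) removed.insert(guid);
    }

    bool contains(const std::string& guid) const
    {
        return added.count(guid) != 0 && removed.count(guid) == 0;
    }
};

// Merge two replicas by taking the union of both sets. The result is
// the same whichever replica is merged into which, and merging the
// same replica twice changes nothing.
IncomingSet merge(const IncomingSet& a, const IncomingSet& b)
{
    IncomingSet out = a;
    out.added.insert(b.added.begin(), b.added.end());
    out.removed.insert(b.removed.begin(), b.removed.end());
    return out;
}
```

A known limitation of the 2P-set is that a removed element can never be re-added, so a real design would likely need a richer set type.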
Design Alternative B: Perhaps it is possible to store mutable values using the DHT API directly? This would also allow alpha-conversion issues to be handled (as we'd alpha-convert to a unique combinator form, and hash only that.)
Q: How to load the incoming set of an Atom? Currently, the incoming set of an Atom is stored as part of the mutable version of that Atom, and can therefore be fetched. The IPFS CID of the current mutated Atom is obtained by lookup of the AtomSpace (from the single, large directory file that the AtomSpace is stored in).
Q: Is Pin needed to prevent a published AtomSpace from disappearing? Doesn't seem to be!? (Yet. As long as my IPFS daemon stays up...)
Idea: Use pubsub to publish value updates.
The current encoding is gonna do alpha-equivalence all wrong. As long as there is a single writer, then the AtomSpace can hide this via the usual alpha-renaming techniques. But in a multi-user setup, this will surely lead to distinct-but-alpha-equivalent Atoms. The core problem is that we cannot tell IPFS to skip certain parts of the file, when computing the content hash. Maybe this is possible with direct DHT access?

IPNS++

It currently appears to be impossible to map the AtomSpace into IPFS without resorting to a single, centralized file that contains the AtomSpace contents. Clearly, this would be a bad design, for all of the usual reasons associated with centralization.

However, a good high-quality, truly decentralized design would be possible if IPNS was modified slightly. Currently, IPNS does this:

    PKI public-key ==> resolved CID

The ideal IPNS lookup would be this:

    (PKI public-key, hash) ==> resolved CID

If the above were possible, the AtomSpace mapping would become straightforward: The hash would be the hash of an Atom, and the resolved CID would contain the Values associated with that Atom. This works, because the hash of an Atom is globally unique: anyone can know what it is. Anyone having access to the public-key would then have read-access to that particular AtomSpace. Anyone having the private key would have write access. All operations are distributed, decentralized, assuming that the lookup itself can be made decentralized.

Of course, it's easy to create a centralized hash lookup: a single large file containing a list of hash ==> CID mappings. But that suffers from all the typical problems of centralization: the multiple-writers problem, problems with being a bottleneck for updates, file-size issues, etc.
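
The difference between the two lookups can be sketched as a change of key type. This is a single-process illustration under stated assumptions: `KeyedNameTable`, `publish` and `resolve` are hypothetical names, strings stand in for keys, hashes and CIDs, and the signature check that would restrict writes to private-key holders is left out. No such API exists in IPFS today.

```cpp
#include <map>
#include <string>
#include <utility>

using PublicKey = std::string;  // identifies one AtomSpace
using AtomHash  = std::string;  // globally unique hash of one Atom
using Cid       = std::string;  // CID of the file holding the Values

// Current IPNS: one mutable record per public key.
using Ipns = std::map<PublicKey, Cid>;

// Proposed lookup: one mutable record per (public key, Atom hash).
class KeyedNameTable {
public:
    // Point (key, atom) at a new Valuation CID. Other Atoms under
    // the same key are not touched.
    void publish(const PublicKey& key, const AtomHash& atom, const Cid& valuation)
    {
        records_[std::make_pair(key, atom)] = valuation;
    }

    // Returns true and fills `out` if a record exists for (key, atom).
    bool resolve(const PublicKey& key, const AtomHash& atom, Cid& out) const
    {
        auto it = records_.find(std::make_pair(key, atom));
        if (it == records_.end()) return false;
        out = it->second;
        return true;
    }

private:
    std::map<std::pair<PublicKey, AtomHash>, Cid> records_;
};
```

Because each record is addressed by the pair, updating one Atom's Values never rewrites a shared directory file. That is what would remove the single-file bottleneck, provided the lookup itself can be distributed.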

Build Prereqs

Clone and build the AtomSpace.

Install IPFS Core. On Debian/Ubuntu:

    curl https://get.siderus.io/key.public.asc | sudo apt-key add -
    echo "deb https://get.siderus.io/ apt/" | sudo tee -a /etc/apt/sources.list.d/siderus.list
    sudo apt update
    sudo apt install ipfs

Get familiar with IPFS. Some useful commands:

    ipfs init
    ipfs daemon
    ipfs swarm peers
    ipfs commands

Install the IPFS C++ client library

The upstream library is at https://github.com/vasild/cpp-ipfs-api, but this driver needs the extended, updated version at https://github.com/linas/cpp-ipfs-api. To build and install it:

    git clone https://github.com/linas/cpp-ipfs-api
    cd cpp-ipfs-api
    git checkout master-linas
    mkdir build; cd build; cmake ..; make -j; sudo make install

This needs the package "JSON for Modern C++": sudo apt install nlohmann-json3-dev

API documentation is here: https://github.com/ipfs/interface-js-ipfs-core/tree/master/SPEC and here: https://vasild.github.io/cpp-ipfs-http-client/classipfs_1_1Client.html.

Building

Building is just like that for any other OpenCog component. After installing the pre-reqs, do this:

   mkdir build
   cd build
   cmake ..
   make -j
   sudo make install

Then go through the examples directory.
