Compare commits
16 Commits
e66e3202da
...
main
| Author | SHA1 | Date |
|---|---|---|
| | 028bc1df84 | |
| | 82a017f624 | |
| | 9d9f8f6d72 | |
| | e01aa9a607 | |
| | 3055ee37df | |
| | a4f749ebd7 | |
| | 0ecc0ecd3a | |
| | 657ced1ceb | |
| | d21158eb91 | |
| | 98299daa1b | |
| | 6020db4d15 | |
| | 2b06a5f866 | |
| | 845a54787b | |
| | f162bb946a | |
| | 00f795f16d | |
| | b0f77831bd | |
79
README.md
@@ -1,8 +1,60 @@
|
||||
# lumbunglib
|
||||

|
||||
|
||||
> Python lib which powers `lumbung.space` automation
|
||||
# Konfluks
|
||||
|
||||
## hacking
|
||||
A drainage basin is a geographical feature that collects all precipitation in an area, first into smaller streams and finally into a large river. Similarly, Konfluks brings small and dispersed streams of web content from different applications and websites together into a single large stream.
|
||||
|
||||
Specifically, Konfluks turns Peertube videos, iCal calendar events, other websites (through their RSS and OPDS feeds) and Mastodon posts under a hashtag into Hugo page bundles. This allows one to publish from diverse sources to a single stream.
|
||||
|
||||
Konfluks was first made by [Roel Roscam Abbing](https://test.roelof.info/) as part of [lumbung.space](https://lumbung.space), together with [ruangrupa](https://ruangrupa.id) and [Autonomic](https://autonomic.zone).
|
||||
|
||||
## Philosophy
|
||||
|
||||
Konfluks tries to act as a mirror representation of the input sources. That means that whenever something remote is deleted, changed or becomes unavailable, it is also changed or deleted by Konfluks.
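The mirroring rule can be sketched as a set difference between the posts already on disk and the posts the remote source still returns. This is a minimal illustration, not the Konfluks implementation: `stale_posts` and its arguments are hypothetical names.

```python
def stale_posts(existing_posts, returned_posts):
    """Return the local post names that the remote source no longer lists."""
    return sorted(set(existing_posts) - set(returned_posts))


# Hypothetical example data: what is on disk versus what the remote returns.
local = ["video-a", "video-b", "video-c"]
remote = ["video-a", "video-c", "video-d"]
print(stale_posts(local, remote))
```

Here only `video-b` is reported as stale and would be removed locally, while `video-d` is new and would be created by the normal post-creation path.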
|
||||
|
||||
Konfluks tries to preserve intention. That means the above, but also requiring explicit ways of publishing.
|
||||
|
||||
Konfluks works by periodically polling the remote sources, taking care not to duplicate work. It caches files, asks for last-modified headers, and skips things it already has. This makes every poll as fast and as light as possible.
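One common way to avoid re-downloading unchanged content is an HTTP conditional request. The sketch below assumes the remote server honours `If-Modified-Since` and answers `304 Not Modified`. It uses only the standard library, and `fetch_if_changed` is a hypothetical helper rather than a function from Konfluks.

```python
import urllib.error
import urllib.request


def fetch_if_changed(url, last_modified=None, opener=urllib.request.urlopen):
    """Fetch url unless the server reports it unchanged.

    Returns (body, last_modified). body is None when the server answers
    304 Not Modified, in which case the cached copy can be reused.
    """
    request = urllib.request.Request(url)
    if last_modified:
        # Ask the server to send the body only if it changed since this time.
        request.add_header("If-Modified-Since", last_modified)
    try:
        with opener(request) as response:
            return response.read(), response.headers.get("Last-Modified")
    except urllib.error.HTTPError as error:
        if error.code == 304:
            # Nothing new: keep the timestamp from the previous poll.
            return None, last_modified
        raise
```

A poller would store the returned `Last-Modified` value next to its cached copy and pass it back in on the next run. The `opener` parameter exists only so the function can be exercised without network access.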
|
||||
|
||||
Konfluks is written for clarity, not brevity or cleverness.
|
||||
|
||||
Konfluks is extendable, a work in progress and a messy undertaking.
|
||||
|
||||
## High-level overview
|
||||
|
||||
Konfluks consists of different Python scripts which each poll a particular service, say, a [Peertube](https://joinpeertube.org) server, to download information and convert it into [Hugo Page Bundles](https://gohugo.io/content-management/page-bundles/).
|
||||
|
||||
Each script that is part of Konfluks essentially does the following:

* Parse a source and request posts/updates/videos/a feed
  * Taking care of publish queues
* Create a Hugo post for each item returned, by:
  * Making a folder per post in the `output` directory
  * Formatting post metadata as [Hugo Post Frontmatter](https://gohugo.io/content-management/front-matter/) in a file called `index.md`
  * Grabbing local copies of media and saving them in the post folder
  * Adding the post content to `index.md`
  * According to jinja2 templates (see `konfluks/templates/`)
|
||||
|
||||
Where possible, the page bundles are given human-friendly names.
|
||||
|
||||
Here is a typical output structure:
|
||||
|
||||
```
user@server: ~/konfluks/output: tree tv/
tv/
├── forum-27an-mother-earth-353f93f3-5fee-49d6-b71d-8aef753f7041
│   ├── 86ccae63-3df9-443c-91f3-edce146055db.jpg
│   └── index.md
├── keroncong-tugu-cafrinho-live-at-ruru-gallery-ruangrupa-jakarta-19-august-2014-e6d5bb2a-d77f-4a00-a449-992a579c8c0d
│   ├── 32291aa2-a391-4219-a413-87521ff373ba.jpg
│   └── index.md
├── lecture-series-1-camp-notes-on-education-8d54d3c9-0322-42af-ab6e-e954d251e076
│   ├── 0f3c835b-42c2-48a3-a2a3-a75ddac8688a.jpg
│   └── index.md
```
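The folder-per-post layout shown above can be sketched in a few lines. This is a simplified illustration under stated assumptions: Konfluks renders `index.md` from jinja2 templates, whereas this sketch builds the front matter with plain string formatting, and `write_page_bundle` and the example values are hypothetical.

```python
import os
import tempfile


def write_page_bundle(output_dir, post_name, frontmatter, content):
    """Create <output_dir>/<post_name>/index.md with front matter and content."""
    post_dir = os.path.join(output_dir, post_name)
    os.makedirs(post_dir, exist_ok=True)

    lines = ["---"]
    for key, value in frontmatter.items():
        # Values are quoted naively; real front matter needs proper YAML escaping.
        lines.append(f'{key}: "{value}"')
    lines.append("---")
    lines.append("")
    lines.append(content)

    index_path = os.path.join(post_dir, "index.md")
    with open(index_path, "w", encoding="utf-8") as index_file:
        index_file.write("\n".join(lines) + "\n")
    return index_path


# Usage: write one bundle into a throwaway directory and show the result.
with tempfile.TemporaryDirectory() as output_dir:
    path = write_page_bundle(
        output_dir,
        "example-video",
        {"title": "Example video", "draft": "false"},
        "Post body goes here.",
    )
    with open(path, encoding="utf-8") as index_file:
        print(index_file.read())
```

Media files downloaded for the post would be saved next to `index.md` in the same folder, which is what makes the folder a Hugo page bundle.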
|
||||
|
||||
## Hacking
|
||||
|
||||
Install [poetry](https://python-poetry.org/docs/#osx--linux--bashonwindows-install-instructions):
|
||||
|
||||
@@ -10,31 +62,20 @@ Install [poetry](https://python-poetry.org/docs/#osx--linux--bashonwindows-insta
|
||||
curl -sSL https://raw.githubusercontent.com/python-poetry/poetry/master/get-poetry.py | python -
|
||||
```
|
||||
|
||||
We use Poetry because it locks the dependencies all the way down and makes it
|
||||
easier to manage installation & maintenance in the long-term. Then install the
|
||||
dependencies & have them managed by Poetry:
|
||||
We use Poetry because it locks the dependencies all the way down and makes it easier to manage installation & maintenance in the long-term. Then install the dependencies & have them managed by Poetry:
|
||||
|
||||
```
|
||||
poetry install
|
||||
```
|
||||
|
||||
Each script requires some environment variables to run, you can see the latest
|
||||
deployment configuration over
|
||||
[here](https://git.autonomic.zone/ruangrupa/lumbung.space/src/branch/main/compose.yml),
|
||||
look for the values under the `environment: ...` stanza.
|
||||
Each script requires some environment variables to run. You can see the latest deployment configuration [here](https://git.autonomic.zone/ruangrupa/lumbung.space/src/branch/main/compose.yml); look for the values under the `environment: ...` stanza.
|
||||
|
||||
All scripts have an entrypoint described in the
|
||||
[`pypoetry.toml`](https://git.autonomic.zone/ruangrupa/lumbunglib/src/commit/40bf9416b8792c08683ad8ac878093c7ef1b2f5d/pyproject.toml#L27-L31)
|
||||
which you can run via `poetry run ...`. For example, if you want to run the
|
||||
[`lumbunglib/video.py`](./lumbunglib/video.py) script, you'd do:
|
||||
All scripts have an entrypoint described in the [`pyproject.toml`](./pyproject.toml) which you can run via `poetry run ...`. For example, if you want to run the [`konfluks/video.py`](./konfluks/video.py) script, you'd do:
|
||||
|
||||
```
|
||||
mkdir -p testdir
|
||||
export OUTPUT_DIR=./testdir
|
||||
poetry run lumbunglib-vid
|
||||
poetry run konfluks-vid
|
||||
```
|
||||
|
||||
Run `poetry run poetry2setup > setup.py` if updating the poetry dependencies.
|
||||
This allows us to run `pip install .` in the deployment and Pip will understand
|
||||
that it is just a regular Python package. If adding a new cli command, extend
|
||||
`pyproject.toml` with a new `[tool.poetry.scripts]` entry.
|
||||
Run `poetry run poetry2setup > setup.py` if updating the poetry dependencies. This allows us to run `pip install .` in the deployment and Pip will understand that it is just a regular Python package. If adding a new cli command, extend `pyproject.toml` with a new `[tool.poetry.scripts]` entry.
|
||||
|
31
konfluks.svg
Normal file
File diff suppressed because one or more lines are too long
After: Size 29 KiB
@@ -138,9 +138,9 @@ def create_event_post(post_dir, event):
|
||||
for img in event_metadata["images"]:
|
||||
|
||||
# parse img url to safe local image name
|
||||
img_name = img.split("/")[-1]
|
||||
fn, ext = img_name.split(".")
|
||||
img_name = slugify(fn) + "." + ext
|
||||
img_name = os.path.basename(img)
|
||||
fn, ext = os.path.splitext(img_name)
|
||||
img_name = slugify(fn) + ext  # os.path.splitext keeps the leading dot in ext
|
||||
|
||||
local_image = os.path.join(post_dir, img_name)
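A note on this hunk: `os.path.splitext` returns the extension with its leading dot, so the extension should be appended without an extra `"."`. The earlier `img_name.split(".")` unpacking would also raise `ValueError` for names containing more than one dot. The sketch below illustrates the behaviour; `simple_slug` is a hypothetical stand-in for the project's `slugify`, and the URL is made up.

```python
import os
import re
from urllib.parse import urlparse


def simple_slug(text):
    """Hypothetical stand-in for slugify: lowercase, non-alphanumerics to '-'."""
    return re.sub(r"[^a-z0-9]+", "-", text.lower()).strip("-")


def safe_image_name(img_url):
    """Derive a local file name from an image URL."""
    # Use only the path component so query strings do not end up in the name.
    img_name = os.path.basename(urlparse(img_url).path)
    stem, ext = os.path.splitext(img_name)  # ext is ".JPG" here, "" if absent
    return simple_slug(stem) + ext


print(safe_image_name("https://example.org/media/My_Photo.v2.JPG?size=large"))
```

With this input the stem `My_Photo.v2` is slugified on its own and the original extension `.JPG` is re-attached once.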
|
||||
|
@@ -155,8 +155,11 @@ def parse_enclosures(post_dir, entry):
|
||||
if "type" in e:
|
||||
print("found enclosed media", e.type)
|
||||
if "image/" in e.type:
|
||||
if not os.path.exists(post_dir): #this might be redundant with create_post
|
||||
os.makedirs(post_dir)
|
||||
featured_image = grab_media(post_dir, e.href)
|
||||
entry["featured_image"] = featured_image
|
||||
media_item = urlparse(e.href).path.split('/')[-1]
|
||||
entry["featured_image"] = media_item
|
||||
else:
|
||||
print("FIXME:ignoring enclosed", e.type)
|
||||
return entry
|
||||
@@ -373,62 +376,66 @@ def main():
|
||||
|
||||
data = grab_feed(feed_url)
|
||||
|
||||
if data:
|
||||
if data: #whenever we get a 200
|
||||
if data.feed: #only if it is an actual feed
|
||||
opds_feed = False
|
||||
if 'links' in data.feed:
|
||||
for i in data.feed['links']:
|
||||
if i['rel'] == 'self':
|
||||
if 'opds' in i['type']:
|
||||
opds_feed = True
|
||||
print("OPDS type feed!")
|
||||
|
||||
opds_feed = False
|
||||
for i in data.feed['links']:
|
||||
if i['rel'] == 'self':
|
||||
if 'opds' in i['type']:
|
||||
opds_feed = True
|
||||
print("OPDS type feed!")
|
||||
for entry in data.entries:
|
||||
# if 'tags' in entry:
|
||||
# for tag in entry.tags:
|
||||
# for x in ['lumbung.space', 'D15', 'lumbung']:
|
||||
# if x in tag['term']:
|
||||
# print(entry.title)
|
||||
entry["feed_name"] = feed_name
|
||||
|
||||
post_name = slugify(entry.title)
|
||||
|
||||
for entry in data.entries:
|
||||
# if 'tags' in entry:
|
||||
# for tag in entry.tags:
|
||||
# for x in ['lumbung.space', 'D15', 'lumbung']:
|
||||
# if x in tag['term']:
|
||||
# print(entry.title)
|
||||
entry["feed_name"] = feed_name
|
||||
# pixelfed returns the whole post text as the post name. max
|
||||
# filename length is 255 on many systems. here we're shortening
|
||||
# the name and adding a hash to it to avoid a conflict in a
|
||||
# situation where 2 posts start with exactly the same text.
|
||||
if len(post_name) > 150:
|
||||
post_hash = md5(bytes(post_name, "utf-8"))
|
||||
post_name = post_name[:150] + "-" + post_hash.hexdigest()
|
||||
|
||||
post_name = slugify(entry.title)
|
||||
|
||||
# pixelfed returns the whole post text as the post name. max
|
||||
# filename length is 255 on many systems. here we're shortening
|
||||
# the name and adding a hash to it to avoid a conflict in a
|
||||
# situation where 2 posts start with exactly the same text.
|
||||
if len(post_name) > 150:
|
||||
post_hash = md5(bytes(post_name, "utf-8"))
|
||||
post_name = post_name[:150] + "-" + post_hash.hexdigest()
|
||||
|
||||
if opds_feed:
|
||||
entry['opds'] = True
|
||||
#format: Beyond-Debiasing-Report_Online-75535a4886e3
|
||||
post_name = slugify(entry['title'])+'-'+entry['id'].split('-')[-1]
|
||||
|
||||
post_dir = os.path.join(output_dir, feed_name, post_name)
|
||||
|
||||
if post_name not in existing_posts:
|
||||
# if there is a blog entry we dont already have, make it
|
||||
if opds_feed:
|
||||
create_opds_post(post_dir, entry)
|
||||
else:
|
||||
create_post(post_dir, entry)
|
||||
entry['opds'] = True
|
||||
#format: Beyond-Debiasing-Report_Online-75535a4886e3
|
||||
post_name = slugify(entry['title'])+'-'+entry['id'].split('-')[-1]
|
||||
|
||||
elif post_name in existing_posts:
|
||||
# if we already have it, update it
|
||||
if opds_feed:
|
||||
create_opds_post(post_dir, entry)
|
||||
else:
|
||||
create_post(post_dir, entry)
|
||||
existing_posts.remove(
|
||||
post_name
|
||||
) # create list of posts which have not been returned by the feed
|
||||
post_dir = os.path.join(output_dir, feed_name, post_name)
|
||||
|
||||
for post in existing_posts:
|
||||
# remove blog posts no longer returned by the RSS feed
|
||||
print("deleted", post)
|
||||
shutil.rmtree(os.path.join(feed_dir, slugify(post)))
|
||||
if post_name not in existing_posts:
|
||||
# if there is a blog entry we dont already have, make it
|
||||
if opds_feed:
|
||||
create_opds_post(post_dir, entry)
|
||||
else:
|
||||
create_post(post_dir, entry)
|
||||
|
||||
elif post_name in existing_posts:
|
||||
# if we already have it, update it
|
||||
if opds_feed:
|
||||
create_opds_post(post_dir, entry)
|
||||
else:
|
||||
create_post(post_dir, entry)
|
||||
existing_posts.remove(
|
||||
post_name
|
||||
) # create list of posts which have not been returned by the feed
|
||||
|
||||
|
||||
for post in existing_posts:
|
||||
# remove blog posts no longer returned by the RSS feed
|
||||
post_dir = os.path.join(output_dir, feed_name, post)
|
||||
shutil.rmtree(post_dir)
|
||||
print("deleted", post_dir)
|
||||
else:
|
||||
print(feed_url, "is not or no longer a feed!")
|
||||
|
||||
end = time.time()
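The name-shortening step in this hunk can be isolated into a small function. This is a sketch of the same idea (truncate at 150 characters, then append an MD5 digest of the full name so that posts sharing a long prefix do not collide). The function name `shorten_post_name` and the `max_length` parameter are assumptions made for illustration.

```python
from hashlib import md5


def shorten_post_name(post_name, max_length=150):
    """Truncate long names and append a digest of the full name."""
    if len(post_name) <= max_length:
        return post_name
    # The digest is computed over the untruncated name, so two names that
    # share their first max_length characters still get different results.
    digest = md5(post_name.encode("utf-8")).hexdigest()
    return post_name[:max_length] + "-" + digest


first = "x" * 150 + "-first-ending"
second = "x" * 150 + "-second-ending"
print(shorten_post_name(first) != shorten_post_name(second))
print(len(shorten_post_name(first)))
```

The shortened name is 150 characters plus a hyphen plus the 32-character hex digest, which stays well under the 255-character file name limit mentioned in the code comment.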
|
||||
|
@@ -60,6 +60,21 @@ def download_media(post_directory, media_attachments):
|
||||
with open(os.path.join(post_directory, image), "wb") as img_file:
|
||||
shutil.copyfileobj(response.raw, img_file)
|
||||
print("Downloaded cover image", image)
|
||||
elif item["type"] == "video":
|
||||
video = localize_media_url(item["url"])
|
||||
if not os.path.exists(os.path.join(post_directory, video)):
|
||||
# download video file
|
||||
response = requests.get(item["url"], stream=True)
|
||||
with open(os.path.join(post_directory, video), "wb") as video_file:
|
||||
shutil.copyfileobj(response.raw, video_file)
|
||||
print("Downloaded video in post", video)
|
||||
if not os.path.exists(os.path.join(post_directory, "thumbnail.png")):
|
||||
#download video preview
|
||||
response = requests.get(item["preview_url"], stream=True)
|
||||
with open(os.path.join(post_directory, "thumbnail.png"), "wb") as thumbnail:
|
||||
shutil.copyfileobj(response.raw, thumbnail)
|
||||
print("Downloaded thumbnail for", video)
|
||||
|
||||
|
||||
|
||||
def create_post(post_directory, post_metadata):
|
||||
@@ -78,7 +93,6 @@ def create_post(post_directory, post_metadata):
|
||||
post_metadata["account"]["display_name"] = name
|
||||
env.filters["localize_media_url"] = localize_media_url
|
||||
env.filters["filter_mastodon_urls"] = filter_mastodon_urls
|
||||
|
||||
template = env.get_template("hashtag.md")
|
||||
|
||||
with open(os.path.join(post_directory, "index.html"), "w") as f:
|
@@ -2,7 +2,7 @@
|
||||
title: "{{ event.name }}"
|
||||
date: "{{ event.begin }}" #2021-06-10T10:46:33+02:00
|
||||
draft: false
|
||||
categories: "calendar"
|
||||
source: "lumbung calendar"
|
||||
event_begin: "{{ event.begin }}"
|
||||
event_end: "{{ event.end }}"
|
||||
duration: "{{ event.duration }}"
|
@@ -3,10 +3,11 @@ title: "{{ frontmatter.title }}"
|
||||
date: "{{ frontmatter.date }}" #2021-06-10T10:46:33+02:00
|
||||
draft: false
|
||||
summary: "{{ frontmatter.summary }}"
|
||||
authors: {% if frontmatter.author %} ["{{ frontmatter.author }}"] {% endif %}
|
||||
contributors: {% if frontmatter.author %} ["{{ frontmatter.author }}"] {% endif %}
|
||||
original_link: "{{ frontmatter.original_link }}"
|
||||
feed_name: "{{ frontmatter.feed_name}}"
|
||||
categories: ["{{ frontmatter.card_type }}", "{{ frontmatter.feed_name}}"]
|
||||
card_type: "{{ frontmatter.card_type }}"
|
||||
sources: ["{{ frontmatter.feed_name}}"]
|
||||
tags: {{ frontmatter.tags }}
|
||||
{% if frontmatter.featured_image %}featured_image: "{{frontmatter.featured_image}}"{% endif %}
|
||||
---
|
27
konfluks/templates/hashtag.md
Normal file
@@ -0,0 +1,27 @@
|
||||
---
|
||||
date: {{ post_metadata.created_at }} #2021-06-10T10:46:33+02:00
|
||||
draft: false
|
||||
contributors: ["{{ post_metadata.account.display_name }}"]
|
||||
avatar: {{ post_metadata.account.avatar }}
|
||||
title: {{ post_metadata.account.display_name }}
|
||||
tags: [{% for i in post_metadata.tags %} "{{ i.name }}", {% endfor %}]
|
||||
images: [{% for i in post_metadata.media_attachments %}{% if i.type == "image" %}"{{ i.url | localize_media_url }}", {%endif%}{% endfor %}]
|
||||
videos: [{% for i in post_metadata.media_attachments %}{% if i.type == "video" %}"{{ i.url | localize_media_url }}", {%endif%}{% endfor %}]
|
||||
---
|
||||
|
||||
{% for item in post_metadata.media_attachments %}
|
||||
{% if item.type == "image" %}
|
||||
<img src="{{item.url | localize_media_url }}" alt="{{item.description}}">
|
||||
{% endif %}
|
||||
{% endfor %}
|
||||
|
||||
{% for item in post_metadata.media_attachments %}
|
||||
{% if item.type == "video" %}
|
||||
<video controls width="540px" preload="none" poster="thumbnail.png">
|
||||
<source src="{{item.url | localize_media_url }}" type="video/mp4">
|
||||
{% if item.description %}{{item.description}}{% endif %}
|
||||
</video>
|
||||
{% endif %}
|
||||
{% endfor %}
|
||||
|
||||
{{ post_metadata.content | filter_mastodon_urls }}
|
@@ -3,10 +3,10 @@ title: "{{ frontmatter.title }}"
|
||||
date: "{{ frontmatter.date }}" #2021-06-10T10:46:33+02:00
|
||||
draft: false
|
||||
summary: "{{ frontmatter.summary }}"
|
||||
authors: {% if frontmatter.author %} ["{{ frontmatter.author }}"] {% endif %}
|
||||
contributors: {% if frontmatter.author %} ["{{ frontmatter.author }}"] {% endif %}
|
||||
original_link: "{{ frontmatter.original_link }}"
|
||||
feed_name: "{{ frontmatter.feed_name}}"
|
||||
categories: ["timeline", "{{ frontmatter.feed_name}}"]
|
||||
sources: ["timeline", "{{ frontmatter.feed_name}}"]
|
||||
timelines: {{ frontmatter.timelines }}
|
||||
hidden: true
|
||||
---
|
@@ -6,9 +6,10 @@ uuid: "{{v.uuid}}"
|
||||
video_duration: "{{ v.duration | duration }} "
|
||||
video_channel: "{{ v.channel.display_name }}"
|
||||
channel_url: "{{ v.channel.url }}"
|
||||
contributors: ["{{ v.account.display_name }}"]
|
||||
preview_image: "{{ preview_image }}"
|
||||
images: ["./{{ preview_image }}"]
|
||||
categories: ["tv","{{ v.channel.display_name }}"]
|
||||
sources: ["{{ v.channel.display_name }}"]
|
||||
is_live: {{ v.is_live }}
|
||||
---
|
||||
|
@@ -1,16 +0,0 @@
|
||||
---
|
||||
date: {{ post_metadata.created_at }} #2021-06-10T10:46:33+02:00
|
||||
draft: false
|
||||
authors: ["{{ post_metadata.account.display_name }}"]
|
||||
avatar: {{ post_metadata.account.avatar }}
|
||||
categories: ["shouts"]
|
||||
images: [{% for i in post_metadata.media_attachments %} {{ i.url }}, {% endfor %}]
|
||||
title: {{ post_metadata.account.display_name }}
|
||||
tags: [{% for i in post_metadata.tags %} "{{ i.name }}", {% endfor %}]
|
||||
---
|
||||
|
||||
{% for item in post_metadata.media_attachments %}
|
||||
<img src="{{item.url | localize_media_url }}" alt="{{item.description}}">
|
||||
{% endfor %}
|
||||
|
||||
{{ post_metadata.content | filter_mastodon_urls }}
|
@@ -1,9 +1,9 @@
|
||||
[tool.poetry]
|
||||
name = "lumbunglib"
|
||||
name = "konfluks"
|
||||
version = "0.1.0"
|
||||
description = "Python lib which powers lumbung[dot]space automation"
|
||||
authors = ["rra", "decentral1se"]
|
||||
license = "GPLv3+"
|
||||
description = "Brings together small and dispersed streams of web content from different applications and websites together in a single large stream."
|
||||
authors = ["rra", "decentral1se", "knoflook"]
|
||||
license = "AGPLv3+"
|
||||
|
||||
[tool.poetry.dependencies]
|
||||
python = "^3.9"
|
||||
@@ -25,8 +25,8 @@ requires = ["poetry-core>=1.0.0"]
|
||||
build-backend = "poetry.core.masonry.api"
|
||||
|
||||
[tool.poetry.scripts]
|
||||
lumbunglib-cal = "lumbunglib.cloudcal:main"
|
||||
lumbunglib-vid = "lumbunglib.video:main"
|
||||
lumbunglib-feed = "lumbunglib.feed:main"
|
||||
lumbunglib-timeline = "lumbunglib.timeline:main"
|
||||
lumbunglib-hash = "lumbunglib.hashtag:main"
|
||||
konfluks-cal = "konfluks.calendars:main"
|
||||
konfluks-vid = "konfluks.video:main"
|
||||
konfluks-feed = "konfluks.feed:main"
|
||||
konfluks-timeline = "konfluks.timeline:main"
|
||||
konfluks-hash = "konfluks.hashtag:main"
|
||||
|
17
setup.py
@@ -2,10 +2,10 @@
|
||||
from setuptools import setup
|
||||
|
||||
packages = \
|
||||
['lumbunglib']
|
||||
['konfluks']
|
||||
|
||||
package_data = \
|
||||
{'': ['*'], 'lumbunglib': ['templates/*']}
|
||||
{'': ['*'], 'konfluks': ['templates/*']}
|
||||
|
||||
install_requires = \
|
||||
['Jinja2>=3.0.3,<4.0.0',
|
||||
@@ -20,14 +20,14 @@ install_requires = \
|
||||
'requests>=2.26.0,<3.0.0']
|
||||
|
||||
entry_points = \
|
||||
{'console_scripts': ['lumbunglib-cal = lumbunglib.cloudcal:main',
|
||||
'lumbunglib-feed = lumbunglib.feed:main',
|
||||
'lumbunglib-timeline = lumbunglib.timeline:main',
|
||||
'lumbunglib-hash = lumbunglib.hashtag:main',
|
||||
'lumbunglib-vid = lumbunglib.video:main']}
|
||||
{'console_scripts': ['konfluks-cal = konfluks.calendars:main',
|
||||
'konfluks-feed = konfluks.feed:main',
|
||||
'konfluks-timeline = konfluks.timeline:main',
|
||||
'konfluks-hash = konfluks.hashtag:main',
|
||||
'konfluks-vid = konfluks.video:main']}
|
||||
|
||||
setup_kwargs = {
|
||||
'name': 'lumbunglib',
|
||||
'name': 'konfluks',
|
||||
'version': '0.1.0',
|
||||
'description': 'Python lib which powers lumbung[dot]space automation',
|
||||
'long_description': None,
|
||||
@@ -45,4 +45,3 @@ setup_kwargs = {
|
||||
|
||||
|
||||
setup(**setup_kwargs)
|
||||
|
||||
|