Over the holiday break I was looking at one of my old projects, µLinux. Turns out I did a fine job really and have decided to revive the project 🥳 -- Just getting the build/tests working on my Mac Studio (Apple Silicon). Check it out! #µLinux
"Problems are Solved by Method" ๐ฆ๐บ๐จโ๐ป๐จโ๐ฆฏ๐นโ ๐โฏ ๐จโ๐ฉโ๐งโ๐ง๐ฅ -- James Mills (operator of twtxt.net / creator of Yarn.social ๐งถ)
Was just catching up on all the LinkedIn garbage that is well umm garbage 🗑️ One was from a candidate I interviewed, so I had to reply to that -- Anyway.... Saw this random post in my "notifications":
How do land that job with a Unicorn
First off, you'll have to define what da fuq a "Unicorn" is! 🤣 My understanding is that a Unicorn is a mythical creature with a horn on its head and wings 🪽 🤦‍♂️
Today we got to explore the Imperial City of Hue amongst other places.
Problem 2: Your SSD-backed database has a usage pattern that rewards you with an 80% page-cache hit-rate (i.e. 80% of disk reads are served directly out of memory instead of going to the SSD). The median query reads 50 distinct disk pages to gather its results (e.g. InnoDB pages in MySQL). What is the expected average query time from your database?
Share your solution via Twtxt and how you arrived at it and I'll share my solution tomorrow!
napkin-math
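One way to set up the estimate, as a rough sketch rather than the official answer: treat each page read as a page-cache hit with probability 0.8 and an SSD read otherwise, then sum over the 50 pages. The two latency figures below are assumptions picked for illustration (ballpark numbers, not measurements), and the function name is hypothetical.

```python
# Napkin-math sketch for Problem 2. Both latencies are assumed ballpark
# figures, not measurements; replace them with numbers for your hardware.
PAGES_PER_QUERY = 50      # from the problem statement (median)
CACHE_HIT_RATE = 0.80     # from the problem statement
MEMORY_READ_US = 1.0      # assumption: reading a cached page from RAM
SSD_READ_US = 100.0       # assumption: one random SSD page read


def expected_query_time_us(pages, hit_rate, mem_us, ssd_us):
    """Expected time in microseconds if pages are read one after another."""
    per_page = hit_rate * mem_us + (1.0 - hit_rate) * ssd_us
    return pages * per_page


estimate = expected_query_time_us(
    PAGES_PER_QUERY, CACHE_HIT_RATE, MEMORY_READ_US, SSD_READ_US
)
print(f"~{estimate / 1000:.2f} ms per query")
```

Under these assumptions the 10 or so pages that miss the cache dominate, and the estimate lands at about 1 ms per query. Reads issued in parallel would lower it.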
New name for a new political party:
Country Uniting Nationally Together
I'm considering becoming a gold or platinum sponsor of the ladybird project
It's hilarious seeing rats nest like this 🤣 in Vietnam's Hoi An! 🤣 Thailand has very similar, not sure which is worse
Problem 1: How much will the storage of logs cost for a standard, monolithic 100,000 RPS web application?
napkin-math
The beach itself is very nicely maintained on a daily basis; unfortunately, however, the sea is full of plastic and rubbish 😢
This whole bitcoin thing is just fucking crazy right? 🤣
Can someone try Alpine Linux with XFCE and Compiz please? Show me how the full screen zoom works in 2024/2025 ๐
This resort, Angsana Lang Co has a big huge canal that goes right around the resort!
And look the wife and I got a welcome message on our bed 🤣
Not much to see in an airport in Taiwan I'm afraid @bender 🤣 And we had a very tight connecting flight!
This bloody inflight.pacwisp.net is Uber expensive haha. I have 4MB left poo 💩
followers field is deprecated ==> https://git.mills.io/yarnsocial/twtxt.dev/pulls/6
Please sign this and share ๐ https://www.change.org/p/oppose-australia-s-proposed-social-media-ban-for-under-16s
Speaking of delicious pizzas, had this nice very sliced Beatty on the weekend! ♨️🌶️
This is what you can see at some local cafes around here 🤣
Starting the call: https://meet.mills.io/call/Yarn.social
Come join us!
Wow! Just Wow! 😮 Discovered this whilst trying to debug why my Youtube frontend no longer works:
$ youtube-dl 'https://www.youtube.com/watch?v=YpiK1FMy2Mg'
[youtube] YpiK1FMy2Mg: Downloading webpage
WARNING: unable to extract uploader id; please report this issue on https://yt-dl.org/bug . Make sure you are using the latest version; see https://yt-dl.org/update on how to update. Be sure to call youtube-dl with the --verbose flag and include its complete output.
ERROR: unable to download video data: HTTP Error 403: Forbidden
@movq Were you going to add Jenny here? https://twtxt.dev/clients.html
Far out, the new Mac Mini is actually cheaper than one from several years ago 😱
One of the things I'm going to work on next (maybe today, we'll see how much time there's left in the day) is being able to load up old conversations (fallen off the cache) like this one.
This pod is now using the index for archive twts instead of the old (naive) disk-based index, which results in millions of files over a long time 🤣
Anyone actively use filters here? 🤔 If so which ones? If not, why? Any more useful than others? 🤔
I want to become a solopreneur 🤣 Build a service, or a set of small services with enough customers (not too many) where they provide enough revenue and receive the right amount of value for service that I can just do that.
Reminder folks of the upcoming Yarn.social monthly online meetup:
- Event: Yarn.social Online Meetup
- When: 23rd November 2024 at 12:00PM UTC (midday)
- Where: Mills Meet : Yarn.social
Yarn.social #Meetup
The Australian Labor government, Albanese and the honourable Michelle Rowland, federal member of parliament and communications minister, are fucking clowns. It's stupid shit like this that's the real problem with "big tech" social media platforms. These morons just simply don't understand basic economics and basic business.
Why would companies like Meta, X and TikTok give up a large multi-billion dollar segment of the market? That is, young children from the ages of ~3 to 16 (yes kids these days can use a computer or device from a pretty young age!)
The whole masquerade of "online safety" and the new Australian legislation of the Online Safety Act 2021 is complete and utter bullshit.
You wanna fix this whole cybercrime and cyber bullying that goes on (which btw if you understood how these fucking platforms worked in the first place, you'd realise drives up engagement on the platforms by abusing human emotional and psychological weakness), then ban and make illegal with multi-Billion dollar fines the following:
- Profiting off data collected from users on your platform(s)
- Categorizing users on your platform and performing A/B tests
- Targeting users (of any age) for advertising
In fact just ban targeted advertising period.
After several hard hours, I think I've recovered the last missing 1 Twt from @bender 🥳 Turns out just before I accidentally nuked my pod, I took a dump of its cache just seconds before 🤣 -- So I was also able to rebuild anything that was missing from the backup from the recent cache dump!
The web is such garbage these days. Or is it the garbage search engines? 🤔
I need money for my mother's heart surgery, there is a shortfall of about 2 million rupiah from a total of 40 million, can you help me with any amount?
Hmmm 🧐
FYI: I will be deleting the following inactive users from my pod (twtxt.net) soon™:
$ ./tools/inactive_users.sh 730
@thgie last seen 732 days ago
@will last seen 740 days ago
@shaneflores last seen 752 days ago
@magnus last seen 757 days ago
@nickmellor last seen 757 days ago
@birb last seen 763 days ago
@screem last seen 772 days ago
@servusdei last seen 774 days ago
@alex last seen 790 days ago
@andreottica last seen 801 days ago
@fox last seen 822 days ago
@anx last seen 829 days ago
@olav last seen 855 days ago
@caesar last seen 866 days ago
@jim last seen 869 days ago
@rell last seen 882 days ago
@readfog last seen 886 days ago
If anyone on this list sees this post and wishes to preserve their feed/account for some reason (beyond the backups I maintain), please login at least once over the coming weeks to get off this list. I will re-run this tool again, and then nuke blindly anything that matches >730 days of inactivity.
So let's recap... We've got Putin waging war against Ukraine. Netanyahu waging war against Palestine. Iran getting involved. Kim Jong Un helping Russia and sending soldiers as resources for Putin's war. And now Trump has won a 2nd term in the US where we'll see him scrap EU sanctions and fines against US companies violating EU laws and what else? 🤔
What dafuq is wrong with this world?!
@wbknl Btw you don't need to mention yourself when composing a new Twt (I think maybe you're doing it from your profile view?) Just expand the box at the top of the Timeline or Discover views.
๐ PR to propose Feed Format Extension -- Request for comment ๐
Neycer Robalino vs Hayden Green - Brisbane Flexi Season (Week 3) Div 1 Final - YouTube This is Neycer, one of our coaches at the table-tennis club that I play at, vs. Hayden, a top-rated QLD player (well not anymore 🤣). What a match! 😱 Go #Brisbane #Table-Tennis #BTTA
FYI: I've put in place 301 Moved Permanently redirect(s) for https://dev.twtxt.net/ and all relevant pages to the new domain https://twtxt.dev
@gallowsgryph do you mind updating the fragment part of your avatar url? ๐
My very strong opinion on the use of Twtxt is if you intend to use it, you should be prepared to let people pull your feed or at least check it at regular intervals.
Otherwise get out and go use something that's either a distributed (Mastodon, AT, etc) or centralized (Facebook, X, etc) network.
After the behaviour of a clearly very angry feed author over the past few days, I'm very tempted to give up on Twtxt and allow it to go back to being dead. What really is the point of building and supporting a way to exchange little pieces of text with one another in a completely decentralized way, if you're just going to keep bumping up against such hostility? I don't know why I do this anymore.
The real crux of the matter is this whole moving feeds around to different uri(s). This makes things hard. I think it's worth revisiting @anth 's UUID idea for its merits.
Ya know; Rather than being an asshole and getting all angry, just be reasonable and reach out to the community or folks fetching (or trying) your feed.
Most clients respect caching if your feed is transported via HTTP.
Otherwise you can add the # refresh hint to clients on your feed.
No need to be an obnoxious ass and flood your own feed. That will just get you permanently unfollowed and ignored.
Offen Fair Web Analytics This looks pretty good, might give this a try. Been using GoatCounter, but it's pretty bland in that it doesn't really tell me much
Reminder folks of the upcoming Yarn.social monthly online meetup:
- Event: Yarn.social Online Meetup
- When: 26th October 2024 at 12:00PM UTC (midday)
- Where: Mills Meet : Yarn.social
Yarn.social #Meetup
@asquare By the way... It might be nice to set yourself up with an Avatar ๐
@aelaraji So, what's your salty addr? I tried to guess it by doing a lookup, but I guess I didn't guess right ๐
The WordPress ecosystem has lost its mind… - YouTube This is a pretty good summary of how fucked up the WordPress ecosystem is now thanks to Matt 🤦‍♂️ (not that I've ever used WordPress uggh 😩)
I can't decide which DCDC charger to buy for my Camper trailer. Help me! Currently it's a choice between:
- KickAss 12V/24V 25A DCDC Charger With Solar MPPT + Pre-Wired Anderson
- iTECHDCDC25 12V/24V 25A DCDC & MPPT Battery Charger
- Renogy DCC30S 12V 30A Dual Input DC to DC Battery Charger with MPPT
The only advantage of the Renogy over the KickAss/iTech models is that it has Bluetooth monitoring and app capabilities so you can check the state of the battery/charging/etc from your phone.
Over the past few days I've been playing around with the latest Chat-GPT, I think the model is called o1-preview. I've used it for various tasks from writing documentation, specs, shell scripts, to code (in Go).
The result? Well I can certainly say the model(s) are much better than they used to be, but maybe that isn't so much the models per se, but the sheer processing power at OpenAI's data centers? 🤔
But here's the kicker though... If anyone ever thinks for a moment that these "AI" things are intelligent, or that the marketing and hype trying to convince us of this "AGI" (Artificial General Intelligence) or "ASI" (Artificial Super Intelligence) is even remotely close to reality, you are sorely mistaken.
Chat-GPT and basically any other technology based on Generative-AI (Gen-AI), these pre-trained transformers that use adversarial neural networks and insanely multi-dimensional vector databases to model all sorts of things from human language and programming languages all the way to visual and audible art, are (wait for it):
Incredibly stupid! 🤦‍♂️
They are effectively quite useless for anything but:
- Reproducing patterns (albeit badly)
- Search and Retrieval (in a way that "seems" to be natural)
And that's about it.
Used as a tool, they're kind of okay, but I wouldn't use Chat-GPT or CoPilot. I'd stick with something more like Codeium if you want a bit of a fancier "auto complete". Otherwise, just forget about the whole thing honestly. It doesn't even really save you time.
If we stuck with Blake2b for Twt Hash(es); what do we think we need to reasonably go to in bit length/size?
=> https://gist.mills.io/prologic/194993e7db04498fa0e8d00a528f7be6
e.g: (turns out @xuu is right about Blake2b being easy/simple too!):
$ printf "%s\t%s\t%s" "https://example.com/twtxt.txt" "2024-09-29T13:30:00Z" "Hello World!" | b2sum -l 32 -t | awk '{ print $1 }'
7b8b79dd
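To compare candidate bit lengths without shelling out, here is a minimal Python sketch of the same idea using hashlib. The function name is hypothetical, and the assumption is that `digest_size` plays the role of `b2sum -l` (BLAKE2 takes the output length as a parameter rather than truncating a longer digest); I have not verified the output byte-for-byte against b2sum.

```python
import hashlib


def twt_hash_blake2b(feed_url: str, timestamp: str, content: str, bits: int = 32) -> str:
    """Hex digest of 'url<TAB>timestamp<TAB>content' using BLAKE2b.

    `bits` is the digest length (like `b2sum -l`); it must be a multiple of 8.
    """
    if bits % 8 != 0 or not 8 <= bits <= 512:
        raise ValueError("bits must be a multiple of 8 between 8 and 512")
    payload = "\t".join((feed_url, timestamp, content)).encode("utf-8")
    return hashlib.blake2b(payload, digest_size=bits // 8).hexdigest()


# Same example inputs as the printf pipeline above, at a few candidate sizes.
for bits in (32, 48, 64):
    digest = twt_hash_blake2b(
        "https://example.com/twtxt.txt", "2024-09-29T13:30:00Z", "Hello World!", bits
    )
    print(bits, digest)
```

Each extra 8 bits adds two hex characters to the Twt Hash, which is the trade-off between readability and collision risk under discussion.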
I am told through various sources that Iran decided last night to attack Israel with over 200 missile strikes in response to Israel attacking Lebanon. ๐ค
@movq I'd love it if you write up a page for jenny ๐ at https://twtxt.dev ๐ค
I'm looking to develop a static site for twtxt.dev -- A domain I own and have wanted to use for developer and specification docs for Twtxt.
Can anyone recommend a few Hugo themes you like?
All of the dev.twtxt.net content would move over as well.
Thanks for joining us on our September monthly Yarn.social meetup today y'all. We had @david @sorenpeter @doesnm @falsifian and @xuu 💪 Nice turnout! (not all at once of course, as we normally run this over 4 hours as we span many time zones!)
Things we talked about:
- Decentralised vs. Distributed
- Use of SHA256 for Twt Hash(es)
- We solved Edits! ๐ฅณ
- UUID(s) probably won't work! (susceptible to spoofing)
- Helped @sorenpeter write some PHP to process/parse User-Agent and serve his feed via a custom PHP script
- @falsifian introduced himself
- Talked about Merkle Trees 🌳
Did I miss anything? 🤔
Summary of Discussions (as best I can):
- @lyse and @sorenpeter both favour simplicity. Both Lyse and Sorenpeter support location-based addressing.
- @falsifian believes we should continue to develop ideas and extensions progressively over time like we've always done.
- @david @quark and @bender would like a better user experience, especially when threads break due to edits, deletions or feed location changes.
- @anth would like to see utf-8 mandated, and the threading model remain largely the same as it is today, which is primarily based on the convention of a Twt Subject anyway, Twt Hash(es) just make the threading "more precise". Anth also states that format, client and server specification/recommendations should be kept separate.
- @movq @xuu sorry you two haven't said too much really, so I'm not too sure?
Overall, the 22 votes we've had on the poll from the community (if you can call it a community?) have clearly shown that:
- We continue to support content-based addressing. (65/35)
- We think about formally supporting edits/deletes (60/40)
- We do not increase the use of cryptography (throwing things like authenticity and identity out the window) (70/30)
And overall the NPS (net promoter score) of "Would I recommend Twtxt to a friend" is a whopping 7/10 (which is crazy! 🤯)
Let's have our monthly catch up soon™ (1hr) and discuss together. My own take on the direction we should take at this point is as follows:
- We continue to use hashing for the threading model.
- We think about changing this to SHA-256 for simplicity.
- We either adopt @anth's UUID approach or @lyse's Dynamic URL approach.
- We continue to incrementally/progressively improve things over time as @falsifian suggested.
- We think about mandating utf-8 as @anth suggests which makes things so much easier for everyone.
- We further discuss the merits/ideas of supporting formal Edit/Delete requests or other ways to better support this in some way.
This Facebook/Meta story on storing passwords in plain text is just wow 😮 -- Like how da fuq does a company, or anyone for that matter in the business of software / technology even do this?! Like at least base64 encode the fuckers right?! (oh wait 🤦‍♂️)
Something @anth said on IRC
17:42
I should also note in there that it doesn't address the two things i really want it to: mandate utf-8 (which should be easy to fit in) and something for better @ mentions.
I actually agree on both counts and it got me thinking...
Sharing the comments of the poll (anonymous so I have no idea whom the comments are from):
your poll should include questions about markdown. personally i think inline bits like style, links, images are yes. block quotes, code blocks, bullet lists are mid. but tables and footnotes are no.
Yes sorry about this, I wasn't able to change much after publishing the poll ๐
I think there's a bug in yarnd however:
$ yarnc debug https://sunshinegardens.org/~xjix/twtxt/tw.txt
...
bqor23a 2024-09-26T11:09:28-07:00
Gemini/Gopher Twtxt feeds each account for less than 1% of the feeds in existence (about 1% combined):
$ total=$(inspect-db yarns.db | jq -r '.Value.URL' | awk -F'//' '{if ($1 ~ /^https?/) print "http/https:"; else print $1}' | sort | uniq -c | awk '{sum+=$1} END {print sum}'); inspect-db yarns.db | jq -r '.Value.URL' | awk -F'//' '{if ($1 ~ /^https?/) print "http/https:"; else print $1}' | sort | uniq -c | awk -v total="$total" '{printf "%d %s %.2f%%\n", $1, $2, ($1/total)*100}' | sort -r
7 gemini: 0.66%
4 gopher: 0.38%
1046 http/https: 98.96%
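The same breakdown is easier to follow outside a one-liner. Below is a small Python sketch, assuming the feed URLs have already been exported from the database as a list of strings; the function name and the sample URLs are hypothetical.

```python
from collections import Counter
from urllib.parse import urlsplit


def scheme_shares(urls):
    """Return {scheme: (count, percent)} with http and https merged."""
    counts = Counter()
    for url in urls:
        scheme = urlsplit(url).scheme.lower()
        counts["http/https" if scheme in ("http", "https") else scheme] += 1
    total = sum(counts.values())
    return {s: (n, 100.0 * n / total) for s, n in counts.items()}


# Hypothetical sample; the real input would be the URLs stored in yarns.db.
sample = [
    "https://example.com/twtxt.txt",
    "http://example.org/twtxt.txt",
    "gemini://example.net/twtxt.txt",
    "gopher://example.net/0/twtxt.txt",
]
for scheme, (count, pct) in sorted(scheme_shares(sample).items()):
    print(f"{count} {scheme}: {pct:.2f}%")
```

Merging http and https into one bucket mirrors what the awk filter does before counting.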
"For every complex problem, there is a solution that is clear, simple, and wrong."
-- H.L. Mencken
"Everything should be made as simple as possible, but not simpler." -- Albert Einstein
The beauty of simplicity lies in not losing the essence.
simplicity #Einstein #wisdom
Don't forget about the upcoming Yarn.social monthly online meetup. See #jjbnvgq for details.
Last day to have your say before our monthly online meetup ๐
Hmm this question has "Yes" in the lead so far with 13 votes:
Should we formally support edit and deletion requests?
Thanks y'all for voting (it's all anonymous so I have no idea who's voted for what!)
If you haven't already had your say, please do so here: http://polljunkie.com/poll/xdgjib/twtxt-v2 -- This is my feeble attempt at trying to ascertain the voice of the greater community with ideas of a Twtxt v2 specification (which I'm hoping will just be an improved specification of what we largely have already built to date with some small but important improvements)
Starting a couple of new projects (geez where do I find the time?!):
HomeTunnel:
HomeTunnel is a self-hosted solution that combines secure tunneling, proxying, and automation to create your own private cloud. Utilizing Wireguard for VPN, Caddy for reverse proxying, and Traefik for service routing, HomeTunnel allows you to securely expose your home network services (such as Gitea, Poste.io, etc.) to the Internet. With seamless automation and on-demand TLS, HomeTunnel gives you the power to manage your own cloud-like environment with the control and privacy of self-hosting.
CraneOps:
CraneOps is an open-source operator framework, written in Go, that allows self-hosters to automate the deployment and management of infrastructure and applications. Inspired by Kubernetes operators, CraneOps uses declarative YAML Custom Resource Definitions (CRDs) to manage Docker Swarm deployments on Proxmox VE clusters.
Don't forget about the upcoming Yarn.social online meetup coming up this Saturday! See #jjbnvgq for details! -- Hope to see y'all there 💪
Don't forget to take the Twtxt v2 poll if you haven't done so already (sorry about the confusing question at the end!)
Don't forget about the upcoming Yarn.social meetup coming up this Saturday! See #jjbnvgq for details! Hope to see some/all of y'all there 💪
Just out of curiosity, I inspected the yarns database (the search engine/crawler) to find the average length of a Twtxt URI:
$ inspect-db yarns.db | jq -r '.Value.URL' | awk '{ total += length; count++ } END { if (count > 0) print total / count }'
40.3387
Given that an RFC3339 UTC timestamp has a length of 20 characters with seconds precision, we're talking about a Twt Subject taking up ~63 characters/bytes on average.
Reminder to take the Twtxt (anonymous) Poll: http://polljunkie.com/poll/xdgjib/twtxt-v2
Apologies, I can't edit the poll once it's live, so the suggestion on feedback for supporting Markdown will have to be discussed at another time.
So I whipped up a quick shell script to demonstrate what I mean by the increase in feed size on average as well as the expected increase in storage and retrieval requirements.
$ ./compare.sh
Original file size: 28145 bytes
Modified file size: 70672 bytes
Percentage increase in file size: 151.10%
...
One of the reasons we originally wanted to use content-based addressing and short hashes as our threading model was to keep individual Twts short so that they were still readable if you viewed them by hand.
With the proposal to switch to location-based addressing using a pointer to a feed and a timestamp in that feed, you're looking at roughly 2025 characters, because the HTTP, HTML and even URI specifications do not specify a maximum length for URI(s), AFAIK, only recommendations.
Another interesting side effect of changing from content-based addressing to location-based addressing is that switching from 7-byte keys to 2025-character keys for 3.5 million entries would expand the database size from 24.5 MB to about 7.09 GB, an increase of roughly 7.06 GB!
So in a location-based system, how exactly do I reply to one of these two Twts from @Yarns? 🤔
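The arithmetic behind those figures, as a quick sketch. It counts key bytes only (no values, no index overhead) and assumes one byte per character; the function name is hypothetical.

```python
def key_storage_bytes(entries: int, key_length: int) -> int:
    """Raw bytes needed for the keys alone (no values, no index overhead)."""
    return entries * key_length


ENTRIES = 3_500_000        # from the text
HASH_KEY_LEN = 7           # content-based: 7-character Twt Hash
LOCATION_KEY_LEN = 2025    # location-based: figure used in the text

small = key_storage_bytes(ENTRIES, HASH_KEY_LEN)
large = key_storage_bytes(ENTRIES, LOCATION_KEY_LEN)
print(f"{small / 1e6:.1f} MB -> {large / 1e9:.2f} GB (+{(large - small) / 1e9:.2f} GB)")
```

The result depends heavily on which key length you plug in: 2025 characters is close to a worst case, whereas an average key of around 63 characters would come to roughly 220 MB for the same 3.5 million entries.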
2024-09-07T12:55:56Z
Okay folks, I've spent all day on this today, and I think it's in "good enough"™ shape to share:
Twtxt v2:
- Specification: https://docs.mills.io/uJXuisaYTRWYDrl8A2jADg?both
- implementation: https://gist.mills.io/prologic/afdec15443da4d7aa898f383f171ec1b
LOL. Not only have I tried to write up a full Twtxt v2 specification, I've also written a Bash shell script that implements the new spec
Reminder folks of the upcoming Yarn.social monthly online meetup:
I hope to see @david @movq @lyse @xuu @sorenpeter and hopefully others too @aelaraji @falsifian and anyone else that sees this! We're hopefully going to primarily discuss the future of Twtxt and the last few weeks of discussions 🤣
- Event: Yarn.social Online Meetup
- When: 28th September 2024 at 12:00pm UTC (midday)
- Where: Mills Meet : Yarn.social
- Cadence: 4th Saturday of every Month
Agenda:
- Let's talk about the upcoming changes to the Twtxt spec(s)
- See #xgghhnq
Yarn.social #Meetup
My Position on the last few weeks of Twtxt spec discussions:
- We increase the Hash length from 7 to 11.
- We formalise the Update Commands extension.
- We amend the Twt Hash and Metadata extension to state:
Feed authors that wish to change the location of their feed (once Twts have been published) must append a new # url = comment to their feed to indicate the new location and thus change the "Hashing URI" used for Twts from that point onward.
This has implications for the "order" of a feed, and we should do one of two things, either:
- Mandate that feeds are append-only.
- Or amend the Metadata spec with a new field that denotes the order of the feed so clients can make sense of "inline" comments in the feed. -- This would also imply that the default order is (of course) append-only. Suggestion:
# direction = [append|prepend]
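To make the proposal concrete, here is a minimal sketch of how a client might track the "Hashing URI" while reading an append-only feed. It illustrates the idea only, not yarnd's or any other client's actual behaviour; the function name, the regular expression and the sample feed are all hypothetical.

```python
import re

# Matches a metadata comment such as "# url = https://example.com/twtxt.txt".
URL_COMMENT = re.compile(r"^#\s*url\s*=\s*(\S+)\s*$")


def hashing_uris(feed_text: str, initial_uri: str):
    """Yield (hashing_uri, twt_line) pairs for an append-only feed.

    Each `# url = ...` comment changes the URI used for the Twts after it.
    """
    current = initial_uri
    for line in feed_text.splitlines():
        match = URL_COMMENT.match(line)
        if match:
            current = match.group(1)
        elif line.strip() and not line.startswith("#"):
            yield current, line


feed = (
    "# url = https://old.example.com/twtxt.txt\n"
    "2024-09-01T10:00:00Z\tFirst Twt\n"
    "# url = https://new.example.com/twtxt.txt\n"
    "2024-09-20T10:00:00Z\tSecond Twt\n"
)
for uri, twt in hashing_uris(feed, "https://old.example.com/twtxt.txt"):
    print(uri, "|", twt)
```

This is why the order of the feed matters: a prepend-style feed would have to be read bottom-up for the same comments to apply to the right Twts.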
I finally decided to do a few experiments with yarnd to see how many things would break and how many assumptions there are around the idea of "Content Addressing"; here's where I'm at so far:
Basically I'm at a point where spending time on this is going to provide very little value. There are assumptions made in the lextwt parser, assumptions made in yarnd, and assumptions in the way storage is done, the way threading works and the way things are looked up. Changing the way Twts are identified to be "location addressed" has far-reaching implications, and I'm quite worried about the amount of effort that would be required to change yarnd here.
Ever wondered what it would cost to self-host vs. use the cloud? Well I often doubt myself every time I look at hardware prices, and I know I have to do some hardware refresh soon™ for the Mills DC (something I don't have a regular plan or budget for), here's a rough ball park:
The Mills DC has cost me around ~$15k to build and maintain over the last ~10 years or so. Roughly speaking. I've never actually taken a Bill of Materials or anything, but I could if anyone is interested in more specifics.
The equivalent of resources if run in the "Cloud" would cost around:
- ~$1,000 for virtual machines
- ~$1,000 for storage
So around ~$2,000/month to run.
Keep this in mind anytime anyone ever tries to con you into believing "Cloud is cheaper". It's not.
Bahahahaha very clever @lyse I look forward to reading your report! 🤣 However...
$ yarnc debug https://twtxt.net/user/prologic/twtxt.txt | grep -E '^pqst4ea' | tee | wc -l
0
I very quickly proved that Twt was never from me 🤣
Reminder that next Saturday 28th September will be our monthly online meetup! Hope to see some/all of you there
Can I get someone like maybe @xuu or @abucci or even @eldersnake -- If you have some spare time -- to test this yarnd PR that upgrades the Bitcask dependency for its internal database to v2?
VERY IMPORTANT If you do; Please Please Please backup your yarn.db database first!
Heaven knows I don't want to be responsible for fucking up a production database here or there 🤣
Can someone much smarter than me help me figure out a couple of newly discovered deadlocks in yarnd that I think have always been there, but were only recently uncovered by the Go 1.23 compiler?
Speaking of AI tech (sorry!); Just came across this really cool tool built by some engineers at Google™ (currently completely free to use without any signup) called NotebookLM. Looks really good for summarizing and talking to documents
An alternate idea for supporting (properly) Twt Edits is to denote as such and extend the meaning of a Twt Subject (which would need to be called something better?); For example, let's say I produced the following Twt:
2024-09-18T23:08:00+10:00
With a SHA1 encoding the probability of a hash collision becomes, at various k (number of twts):
>>> import math
>>>
>>> def collision_probability(k, bits):
... n = 2 ** bits # Total unique hash values based on the number of bits
... probability = 1 - math.exp(- (k ** 2) / (2 * n))
... return probability * 100 # Return as percentage
...
>>> # Example usage:
>>> k_values = [100000, 1000000, 10000000]
>>> bits = 44 # Number of bits for the hash
>>>
>>> for k in k_values:
... print(f"Probability of collision for {k} hashes with {bits} bits: {collision_probability(k, bits):.4f}%")
...
Probability of collision for 100000 hashes with 44 bits: 0.0284%
Probability of collision for 1000000 hashes with 44 bits: 2.8022%
Probability of collision for 10000000 hashes with 44 bits: 94.1701%
>>> bits = 48
>>> for k in k_values:
... print(f"Probability of collision for {k} hashes with {bits} bits: {collision_probability(k, bits):.4f}%")
...
Probability of collision for 100000 hashes with 48 bits: 0.0018%
Probability of collision for 1000000 hashes with 48 bits: 0.1775%
Probability of collision for 10000000 hashes with 48 bits: 16.2753%
>>> bits = 52
>>> for k in k_values:
... print(f"Probability of collision for {k} hashes with {bits} bits: {collision_probability(k, bits):.4f}%")
...
Probability of collision for 100000 hashes with 52 bits: 0.0001%
Probability of collision for 1000000 hashes with 52 bits: 0.0111%
Probability of collision for 10000000 hashes with 52 bits: 1.1041%
>>>
If we adopted this scheme, we would have to increase the no. of characters (first N) from 11 to 12 and finally 13 as the global number of Twts grows large enough. I think at the last full crawl/scrape it was around ~500k (maybe)? https://search.twtxt.net/ says only ~99k
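To turn that into a rule of thumb, here is a sketch that searches for the smallest number of characters that keeps the collision chance under a chosen limit. It reuses the same birthday approximation as the session above and assumes 4 bits per character (hex, matching 11/12/13 characters for 44/48/52 bits). The 1% limit and the name `min_hex_chars` are assumptions for illustration.

```python
import math


def collision_probability(k: int, bits: int) -> float:
    """Birthday approximation: chance of any collision among k hashes."""
    return 1.0 - math.exp(-(k ** 2) / (2.0 * 2 ** bits))


def min_hex_chars(k: int, max_probability: float, bits_per_char: int = 4) -> int:
    """Smallest number of characters keeping the collision chance below the limit."""
    chars = 1
    while collision_probability(k, chars * bits_per_char) > max_probability:
        chars += 1
    return chars


# The 1% threshold is an arbitrary assumption for illustration.
for k in (100_000, 500_000, 1_000_000, 10_000_000):
    print(k, min_hex_chars(k, 0.01))
```

With a 1% limit this gives 11 characters at ~500k Twts, 12 at one million and 14 at ten million, since 13 characters (52 bits) still sits just above 1% there.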
Taking the last n characters of a base32 encoded hash instead of the first n can be problematic for several reasons:
- Hash Structure: A well-designed hash spreads its entropy evenly over every bit of the raw digest, so the digest itself does not favour either end. Any difference between the first and last characters comes from the encoding, not from the hash.
- Encoding Characteristics: Base32 packs 5 bits into each character. A 256-bit digest is not a multiple of 5 bits, so the final character carries only 1 bit of the digest plus 4 padding bits and can take just 2 of the 32 possible values. The first characters are not affected by this.
- Collision Resistance: When using hashes, the goal is to minimize the risk of collisions (different inputs producing the same output). Because of that final character, the last 7 characters carry 31 bits rather than the 35 bits carried by the first 7, which increases the likelihood of collisions.
- Use Cases: In many applications (like generating unique identifiers), every character should carry as much information as possible. A near-constant trailing character reduces the uniqueness of the generated identifiers.
In summary, using the first n characters preserves the full randomness and collision resistance of the hash, making it a safer choice in most cases.
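The "Encoding Characteristics" point can be checked directly. The sketch below hashes a batch of made-up inputs with a 256-bit BLAKE2b digest, Base32-encodes them without padding, and compares how many distinct characters show up in the first and last positions. The inputs and the helper name are hypothetical.

```python
import base64
import hashlib


def base32_digest(data: bytes) -> str:
    """Lowercase, unpadded Base32 of a 256-bit BLAKE2b digest."""
    digest = hashlib.blake2b(data, digest_size=32).digest()
    return base64.b32encode(digest).decode("ascii").rstrip("=").lower()


# Made-up inputs; any varied data shows the same pattern.
encoded = [base32_digest(f"twt {i}".encode()) for i in range(10_000)]
first_chars = {e[0] for e in encoded}
last_chars = {e[-1] for e in encoded}
print(len(encoded[0]), "characters per hash")
print("distinct first characters:", len(first_chars))
print("distinct last characters: ", sorted(last_chars))
```

Because 256 bits is 51 full 5-bit groups plus 1 leftover bit, the 52nd character only ever encodes that single bit followed by zero padding, so it is always `a` or `q`. That is consistent with the Twt Hashes quoted elsewhere in this feed, which all end in one of those two letters.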
Current Twt Hash spec and probability of hash collision:
The probability of a Twt Hash collision depends on the size of the hash and the number of possible values it can take. For the Twt Hash, which uses a Blake2b 256-bit hash, Base32 encoding, and takes the last 7 characters, the space of possible hash values is significantly reduced.
Breakdown:
- Base32 encoding: Each character in the Base32 encoding represents 5 bits of information (since \( 2^5 = 32 \)).
- 7 characters: With 7 characters, the total number of possible hashes is at most \( 32^7 = 2^{35} = 34,359,738,368 \). This gives about 34.4 billion possible hash values.
Probability of Collision:
The probability of a hash collision depends on the number of hashes generated and can be estimated using the Birthday Paradox. The paradox tells us that collisions are more likely than expected when hashing a large number of items.
The approximate formula for the probability of at least one collision after generating n hashes is:
\[ P(\text{collision}) \approx 1 - e^{-\frac{n^2}{2M}} \]
Where:
- \(n\) is the number of generated Twt Hashes.
- \(M = 32^7 = 34,359,738,368\) is the total number of possible hash values.
For practical purposes, here are some example probabilities for different numbers of hashes (n):
- For 1,000 hashes: \( P(\text{collision}) \approx 1 - e^{-\frac{1000^2}{2 \cdot 34,359,738,368}} \approx 0.000015 \) (0.0015%)
- For 10,000 hashes: \( P(\text{collision}) \approx 1 - e^{-\frac{10000^2}{2 \cdot 34,359,738,368}} \approx 0.0015 \) (0.15%)
- For 100,000 hashes: \( P(\text{collision}) \approx 1 - e^{-\frac{100000^2}{2 \cdot 34,359,738,368}} \approx 0.135 \) (13.5%)
Conclusion:
- For small to moderate numbers of hashes (up to around 1,000 to 10,000), the collision probability is quite low.
- However, as the number of Twts grows (100,000 and above), the likelihood of a collision increases significantly (roughly 1 in 7 at 100,000) due to the relatively small hash space (about 34.4 billion).
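The birthday approximation above is easy to evaluate directly. This sketch assumes the nominal hash space of 7 Base32 characters at 5 bits each, i.e. M = 32^7, and treats hashes as uniformly distributed; the function name is hypothetical.

```python
import math

HASH_SPACE = 32 ** 7  # 7 Base32 characters, 5 bits each


def collision_probability(n: int, space: int = HASH_SPACE) -> float:
    """P(at least one collision) ~= 1 - exp(-n^2 / (2M))."""
    # expm1 keeps precision when the exponent is very close to zero.
    return -math.expm1(-(n * n) / (2.0 * space))


print(f"M = {HASH_SPACE:,}")
for n in (1_000, 10_000, 100_000, 1_000_000):
    print(f"{n:>9,} hashes: {collision_probability(n) * 100:.4f}%")
```

Note that `32 ** 7` evaluates to 34,359,738,368. Any structure in the encoded characters that lowers the effective number of values would push these probabilities higher, so treat them as a best case.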
Just experimenting...
$ echo -n "https://twtxt.net/user/prologic/twtxt.txt\n2020-07-18T12:39:52Z\nHello World! ๐" | sha256sum | base64 | tr -d '=' | tail -c 12
NWY4MSAgLQo
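For comparison, here is a sketch of the same experiment in Python that encodes the raw digest bytes. In the shell pipeline above, base64 receives the hexadecimal text printed by sha256sum (including its trailing file marker), and unless echo interprets backslash escapes the `\n` sequences stay literal. That is why the result `NWY4MSAgLQo` decodes to the tail of that printed line rather than to digest bytes. The function name is hypothetical, and the content is plain "Hello World!" for simplicity.

```python
import base64
import hashlib


def sha256_twt_hash(feed_url: str, timestamp: str, content: str, length: int = 11) -> str:
    """Last `length` characters of the unpadded Base64 of the raw SHA-256 digest."""
    payload = "\n".join((feed_url, timestamp, content)).encode("utf-8")
    digest = hashlib.sha256(payload).digest()  # 32 raw bytes, not hex text
    encoded = base64.b64encode(digest).decode("ascii").rstrip("=")
    return encoded[-length:]


print(sha256_twt_hash(
    "https://twtxt.net/user/prologic/twtxt.txt",
    "2020-07-18T12:39:52Z",
    "Hello World!",
))
```

Standard Base64 output can contain `+` and `/`, so a URL-safe alphabet (or Base32/hex) may be a better fit if the hash ends up in links.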
I've been using Codeium too the last week or so! It's pretty good and like @xuu said is a pretty decent Junior assistant, it helps me write good docs and the tab completion is amazing!
It of course completely sucks at doing anything "intelligent" or complex, but if you just use it as a fancier auto complete it's actually half way decent
One thing that's on my mind over the last few days about all this Twt editing and identity stuff we've been having hot debates over is this...
I don't really have a problem with editing twts, or someone changing their feed's URL.
Personally I think the folks that do are rightfully pedantic and like a good user experience, for which I don't blame 'em. I would expect the same too. Anyway, just wanted to get that out there: I believe we can support editing and identity in a way that is still simple, as long as we bring clients along for the ride with us. The old/legacy original client though will have to remain, well, ya know
@lyse and @movq and possibly @aelaraji and even @cuaxolotl -- I'm very curious to understand and hear thoughts, pros and cons or other feelings about introducing the notion of a feed's identity using cryptography? If we were to keep things simple, and use what's commonly available, for example SSH ED25519 keys, using the ssh-keygen -Y sign or ssh-keygen -Y verify tools already available? Maybe in combination with @xuu 's idea of generating a random unique ID for your feed, say # id = and signing it with your ED25519 key?
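For what it's worth, here is a rough sketch of what the two ssh-keygen invocations could look like when driven from a client. It only builds the argument lists and does not run anything. The "twtxt" namespace, the file names and the signer identity are all hypothetical; nothing in this thread has settled on them.

```python
import shlex

# Hypothetical signature namespace; ssh-keygen requires one via -n.
NAMESPACE = "twtxt"


def sign_command(key_path: str, id_file: str) -> list[str]:
    # ssh-keygen -Y sign writes the signature to <id_file>.sig
    return ["ssh-keygen", "-Y", "sign", "-f", key_path, "-n", NAMESPACE, id_file]


def verify_command(allowed_signers: str, identity: str, sig_file: str) -> list[str]:
    # For -Y verify the signed message (the feed id) is read from standard input.
    return ["ssh-keygen", "-Y", "verify", "-f", allowed_signers,
            "-I", identity, "-n", NAMESPACE, "-s", sig_file]


print(shlex.join(sign_command("feed_ed25519", "feed-id.txt")))
print(shlex.join(verify_command("allowed_signers", "feed-owner", "feed-id.txt.sig")))
```

A client would still need some way to learn the feed's public key (the allowed signers file here), which is the harder part of the design.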
Summary of WiscKey: Separating Keys from Values in SSD-Conscious Storage
- Authors: Lanyue Lu, Thanumalayan Sankaranarayana Pillai, Andrea C. Arpaci-Dusseau, Remzi H. Arpaci-Dusseau
- Conference: 14th USENIX Conference on File and Storage Technologies (FAST '16)
- Key Concept: WiscKey is a key-value store that separates keys from values to minimize I/O amplification, optimizing performance for SSDs.
- Performance: WiscKey outperforms LevelDB and RocksDB in various workloads, achieving up to 111× faster loading and improved random lookup speeds.
- Design Goals: Focus on low write/read amplification, SSD optimization, and support for modern features like range queries.
That's an interesting side effect of the new Discover feature that I added some time ago that only displays one post per feed. That is, when you're not logged in and viewing my pod's front page. You can pretty easily and roughly see what the monthly active user count is just by looking at the pager size. 🤔
Even though we're quite a ways from any suburban areas, even with the Internet access via cell towers this poor, using my pod is still very snappy.
Out camping with the family this weekend for my birthday 🥳
@xuu What's the keyoxide thingy you wrote/built? 🤔 What's your URI/profile? 🤔
@cuaxolotl Did you recently change the url metadata key of your feed?
# url = https://sunshinegardens.org/~xj9/twtxt/tw.txt
Was this at one point # url = https://sunshinegardens.org/users/xj9/twtxt/tw.txt ?
Spent the day performing backups (hadn't done it in a while 😱) and wrote a full internal backup definition document that defines my backup process, scope, security, frequency, backup locations, capacity and backup and restoration procedures. Very happy with the doc and the updated (now fully documented) plan and scheduled backup frequency (once per month, which I'll put into my calendar as it's done by hand for now, with tools). So far backing up ~410GB out of a possible ~12.8TB worth of data in two locations -- I deliberately don't back up everything as much of the data can be re-created (music, videos, tv shows, etc). #Backups #Data
Saw this pop up in my GitHub news feed today 🤔 Which links to https://github.com/musingstudio/go-subclub
A Go (golang) library for interacting with the sub.club API.
So I got curious and had a peek ๐
Let's fund the Fediverse
Posting or hosting on the open social networks no longer means you have to do it for free. Developer Preview now available.
And further down:
Monetize your feeds
If you post quality content and you've developed a loyal audience, you should be able to ask your most passionate followers to support you with a premium subscription.
That's a promise not available on the Fediverse ...until now.
Hmmm 🤔
Introduction to JuiceFS | JuiceFS Document Center -- Thinking about using JuiceFS to solve a long-running problem I've always had.
- Be able to run services on any node in my cluster and let Docker Swarm pick whatever node it likes (instead of now where I have to pin some workloads to specific nodes, as that's where their local storage volume is)
- Manage the scalability of data and growth over time instead of what I do now which is to extend EXT4 filesystems on my Docker Swarm nodes every few years.
Anyone had any interactions with @cuaxolotl yet? Or are they using a client that doesn't know how to detect clients following them properly? Hmmm 🧐
It's a really good time to invest in nVIDIA shares 🤣
@abucci appreciate it if you find the time to update again ๐
Time for work™, but I quickly hacked together a bit of a better solution here. Rolling it out to my pod so we'll see how it actually goes. Still possible to abuse if you're a logged in user, etc, but at least now we delete the invalid/bad feed afterwards if it a) was not even a text/plain content-type or b) errored out and was a new fetch of a HTTP feed.
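Roughly, the cleanup rule described above could be sketched like this. It is an illustration only, not the actual yarnd code (which is written in Go); the function names and the way the fetch result is passed in are assumptions.

```python
from typing import Optional


def is_plain_text(content_type: str) -> bool:
    """True when the media type, ignoring parameters like charset, is text/plain."""
    media_type = content_type.split(";", 1)[0].strip().lower()
    return media_type == "text/plain"


def should_delete_feed(content_type: Optional[str], fetch_failed: bool, is_new_feed: bool) -> bool:
    # (a) the response came back, but it was not text/plain at all
    wrong_type = content_type is not None and not is_plain_text(content_type)
    # (b) the fetch errored out and this was the first fetch of the feed
    failed_first_fetch = fetch_failed and is_new_feed
    return wrong_type or failed_first_fetch


print(should_delete_feed("text/html; charset=utf-8", False, False))
print(should_delete_feed(None, True, False))
```

Existing feeds that merely hit a temporary error are kept, which matches the "new fetch" condition in (b).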
Wow! My god, spammers really try hard, don't they? 🤣 Geez 🤦‍♂️ Do we need to make the captcha harder?
My 9yr old daughter just made her first Git commit today, her first website, set up two-factor authentication and used several credentials (which I helped her with) 🤣 -- next lessons: password hygiene/management.
⏰ for our monthly Yarn.social Online Meetup!
- Event: Yarn.social Online Meetup
- When: 24th August 2024 at 12:00pm UTC (midday)
- Cadence: 4th Saturday of every Month
- Agenda: Anything we want to talk about. Twtxt, Yarn, self hosting, cool stuff you've been working on, chit-chat, whatever
Yarn.social #Meetup
I just realized, this is the last Saturday of the month. So Yarn.social meetup is up again tomorrow. Same time as last time if anyone is interested/around to join and hang out!
Does anyone know what the differences between HTTP/1.1, HTTP/2 and HTTP/3 are? 🤔
~2 years later...
Yeah I'm kind of glad they're better at Hardware too and not this (questionable) "social media" thing 🤣 #Mitre10 #Hardware #Social
Dear OnlyDomains, part of Team Internet. Do you think you could stop being so incompetent when it comes to Domains, DNS and basic HTTP? I reported this to you on Friday, and you are still arguing with me via Support over the legitimacy of the claims? Seriously?! 🧐
$ dig @1.1.1.1 +short onlydomains.com.au a
198.50.252.65
$ nc -vvv 198.50.252.65 443
nc: connectx to 198.50.252.65 port 443 (tcp) failed: Connection refused
OnlyDomains
What a glorious morning for a public holiday ๐ช What shall I do today? Hmmm ๐ง
I like how tags like #reading now actually work correctly on Yarn pods ๐
Anyone recommend a domain registrar, that's only a domain registrar and nothing else? I'm not interested in Email Hosting, Web Hosting, Parking, or whatever other silly nonsense. Just domain registration, delegation and renewal.
@falsifian by the way, on the last Saturday of every month, we generally hold an online video call/social meet up, where we just get together and talk about stuff, if you're interested in joining us this month.
Found out today that the AU front door of the registrar I use, OnlyDomains, is DOWN. That is https://onlydomains.com.au
$ host -t A onlydomains.com.au
onlydomains.com.au has address 198.50.252.65
$ curl -v https://onlydomains.com.au/
* Trying 198.50.252.65:443...
* connect to 198.50.252.65 port 443 failed: Connection refused
* Failed to connect to onlydomains.com.au port 443 after 222 ms: Couldn't connect to server
* Closing connection
curl: (7) Failed to connect to onlydomains.com.au port 443 after 222 ms: Couldn't connect to server
OTS works sooo great! Just got my mother to use it to share some creds so I could take over her web hosting needs 🤣
Okay. The house is properly cleaned up. There are 77 users on this pod, 34 inactive and 12 active. That's a good effort I think. Maybe some of those folks that haven't been around for a while, but were pretty decent folks to talk to and interact with may come back. For example @off_grid_living ๐
@bender / @mckinley could you both please change your password immediately? I will also work on some other security hardening that I have a hunch about, but will not publicize for now.
At some point over the next day or two I will be deleting the following feeds/accounts:
https://gist.mills.io/prologic/ae61ae2bfba6401e8955a33394fd858b
If anyone spots anything on this list that shouldn't be deleted, please let me know! ๐
The mobile autocomplete bug is something I can reproduce and likely fix soon™ -- I think it's happening because I accidentally nuked this pod's cache the other day (sorry!) 😢 -- But it is also a bug
FYI: I will be deleting the following 57 inactive (dead?) users on this pod today:
henseegeek fundor333 westbam onlyfansreview mabdalrahman retronav crunched deebs tca qwe234 pfefferle razetime kayos marguesto john yale slackjeff kodaira313a denisovich mlctrez jcrawford l3db3tt3r crunch homer mjy testdrive neoboard svendowideit palash k0rr stxh nirmal_kumar jan6 bram frankiem cvshumake qazsx apoorv10 duriny_test heyjude asepaned testest kevin natascha_e papz anvis spammer lonfas kamme dooven aatikakhan enochthec aman justinakers pc dai superyarn
If you wish to keep your account/feed, please log in immediately. You have ~12 hours from this post (as I'll be out playing table-tennis 🎾)
Big day, hell big weekend! Got a Table Tennis tournament where I'm the team captain of a team of two young players aged 9 and 10 called Spin Kings 🤣 The competition looks really tough, I'm not really sure how we'll go to be honest, but we'll try our best
Hmmm I'm a little concerned, as I'm seeing quite a few feeds I follow in an error state:
I'm not so concerned with the 15x context deadline exceeded but more concerned with:
aelaraji@aelaraji.com Unfollow (6 twts, Last fetched 5m ago with error: dead feed: 403 Forbidden x4 times.)
And:
anth@a.9srv.net Unfollow (1 twts, Last fetched 5m ago with error: Get "http://a.9srv.net/tw.txt": dial tcp 144.202.19.161:80: connect: connection refused x3733 times.)
Hmmm, maybe the stats are a bit off? 🤔
Bit tired myself folks. It's 00:00 here and I'm going to bed ๐
Cool! Our park has disappeared again this morning! 😱 Also it was cold outside! 🥶
Oh I forgot again 🤦‍♂️ Last Saturday of the month, so if anyone's up for a friendly catch up over video tomorrow? Same time, same place
Hmmm something happened last night at ~3am (AEST) that decreased traffic to my pod quite considerably... Hmmm? Anyone have any ideas? 💡
Anyway, I'm gonna have to go to bed... We'll continue this on the weekend. Still trying to hunt down some kind of suspected multi-GB avatar using @stigatle 's pod's cache:
$ (echo "URL Bytes"; sort -n -k 2 -r < avatars.txt | head) | column -t
URL Bytes
https://birkbak.neocities.org/avatar.jpg 667640
https://darch.neocities.org/avatar.png 652960
http://darch.dk/avatar.png 603210
https://social.naln1.ca/media/0c4f65a4be32ff3caf54efb60166a8c965cc6ac7c30a0efd1e51c307b087f47b.png 327947
...
But so far nothing much... Still running the search...
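The same ranking can be done in a few lines of Python, which makes it easier to also skip entries where no size came back. This is just a sketch: it assumes avatars.txt holds one "URL bytes" pair per line, as in the listing above, and largest_avatars is a made-up helper name.

```python
def largest_avatars(lines, limit=10):
    """Return up to `limit` (url, size_in_bytes) pairs, biggest first."""
    entries = []
    for line in lines:
        parts = line.split()
        # Skip blank lines and entries where no Content-Length was reported.
        if len(parts) != 2 or not parts[1].isdigit():
            continue
        entries.append((parts[0], int(parts[1])))
    return sorted(entries, key=lambda entry: entry[1], reverse=True)[:limit]


# Sample lines taken from the listing above.
sample = [
    "http://darch.dk/avatar.png 603210",
    "https://birkbak.neocities.org/avatar.jpg 667640",
    "https://darch.neocities.org/avatar.png 652960",
]

for url, size in largest_avatars(sample):
    print(f"{size / 1_000_000:8.2f} MB  {url}")
```

Note that a server which omits Content-Length on a HEAD request simply won't show up here, so a multi-GB avatar could still hide behind a missing header.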
@abucci / @abucci Any interesting errors pop up in the server logs since the flaw got fixed (unbounded receieveFile())? 🤔
Hmmm 🧐
for url in $(jq -r '.Twters[].avatar' cache.json | sed '/^$/d' | grep -v -E '(twtxt.net|anthony.buc.ci|yarn.stigatle.no|yarn.mills.io)' | sort -u); do echo "$url $(curl -I -s -o /dev/null -w '%header{content-length}' "$url")"; done
...
๐ Let's see... ๐ค
@stigatle / @abucci My current working theory is that there is an asshole out there that has a feed that both your pods are fetching with a multi-GB avatar URL advertised in their feed's preamble (metadata). I'd love for you both to review this PR, and once merged, re-roll your pods and dump your respective caches and share with me using https://gist.mills.io/
Hmm, removed the CPU limits on this pod; not even sure why I had 'em set tbh. We decided at my day job that setting CPU limits on containers is a bit of a silly idea too. Anyway, pod should be much snappier now
Thinking we need to adapt the UI a little bit to something like this
I had a play with LiveKit Agents Playground: KITT and I have to say it's pretty impressive. Not the ChatGPT part of course, but the speech recognition and text to speech synthesis.
KITT is an AI voice assistant powered by LiveKit Agents, Deepgram, Eleven Labs, and ChatGPT. It is running on LiveKit Playground.
It's too bad it relies on three cloud services, none of which can be run locally (with the exception of Ollama that you could replace the OpenAI component with).
Are we over Crowdstrike yet? 🤔🤣 Have we forgotten about it?
Some bad code just broke a billion Windows machines - YouTube -- This is a really good, accurate and comical take on what happened with this whole Crowdstrike global fuck up.
📣 NEW: Added a new feature for pod operators to optionally configure: Compact Front Page.
When enabled, it will display only one post per feed on the unauthenticated Discover view (the front page).
@xuu Your pod is behaving much better now right? Any other issues aside from the Edit problem? ๐ค
@xuu I have a theory as to why your pod was misbehaving too. I think because of the way you were building it (docker build without any --build-arg VERSION= or --build-arg COMMIT=), there was no version information in the built binary and bundled assets. Therefore cache busting would not work as expected. When introducing htmx and hyperscript to create a SPA-like UI/UX experience, this is when things fell apart a bit for you. I think....
There's a new interesting regression in yarnd that's cropped up that results in a " /> at the end of uploaded/linked images. I'm not able to figure this bug out yet 😢
I've been thinking about a new term I've come across whilst reading a book. It's called "Complexity Budget" and I think it has relevance in lots of different fields. I specifically think it has a lot of relevance in the Software Industry and organizations in this field. When doing further research on this concept, I was only able to find talks on complexity budget in the context of medical care, especially psychiatric care. In one talk complexity was described as:
- Complexity is confusing
- Complexity is costly
- Complexity kills
When we think of "complexity" in terms of software and software development, we have a sort of intuition about this, right? We know when software has become too complex. We know when an organization has grown in complexity, or even a system. So we have a good intuition of the concept already.
My question to y'all is: how can we concretely think about "Complexity Budget" and define it in terms that can be leveraged and used to control the complexity of software and systems?
Fixed Solar Panels for Camping -- Looks like a good option for buying fixed solar panels, mounting brackets and other parts for mounting solar panels to your car's roof rack
Finding the technical specifications of older vehicles, say >10 years, is really hard 🤦‍♂️
I just blocked the following ASN(s) from being able to hit twtxt.net or mills.io:
16509 - AMAZON-02
32934 - FACEBOOK
Why? Because the Claude Bot web crawler from Anthropic and Meta's facebookexternalhit web crawler are both behaving badly for pages that have no cache headers. Not sure if this is malicious, an oversight, a bug or me just being stupid and not ensuring every web resource or page had appropriate Cache headers? 🤔 In any case, until I hear back from at least facebookexternalhit (whom I've reached out to), these ASN(s) will remain entirely blocked.
That is the entirety of Amazon Web Services and Facebook.
Can anyone recommend and/or vouch for a Chrome/browser extension that lets me write rewrite rules for arbitrary links on a page? e.g: s#(www\.)?youtube\.com/watch\?v=([^&]+)#tubeproxy.mills.io/play/\2# 🤔
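To make the intent of that rewrite rule concrete, here is a small sketch of the same substitution using Python's re module. The /play/<id> target comes from the rule above; the pattern details (escaped dot and question mark, stopping the video id at & or #) and the rewrite helper name are my assumptions.

```python
import re

# Match YouTube watch links and capture only the video id.
YOUTUBE_WATCH = re.compile(r"https?://(?:www\.)?youtube\.com/watch\?v=([^&#\s]+)")


def rewrite(url: str) -> str:
    """Point a YouTube watch URL at the proxy; leave anything else untouched."""
    return YOUTUBE_WATCH.sub(r"https://tubeproxy.mills.io/play/\1", url)


print(rewrite("https://www.youtube.com/watch?v=abc123"))
print(rewrite("https://example.com/watch?v=abc123"))
```

The optional www. prefix is a non-capturing group here, so the video id is the first (and only) capture.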
Another day, another web app built. This time tubeproxy, which still needs some tidying up project-wise (bugger all docs, setup guide, etc), but so far it works quite nicely. If you're curious, you're welcome to try it out at https://tubeproxy.mills.io -- Although technically this is meant for internal use (as I block YouTube at the network on purpose).
Additional features I'm thinking about next:
- Add to Plex (on-demand download, tag and update of the Plex archives)
- Subscribe (added to my ytdl-sub that subscribes to YouTube channels and stores nicely in Plex)
So Belong (our retail mobile phone provider of choice), who are owned by Telstra, want to increase the price of their plans by +40%.
Telstra, who own Belong, have had the following financial earnings over the past 4 years:
- FY2021: NPAT (Net Profit After Tax): +3.4% $1.9B
- FY2022: NPAT (Net Profit After Tax): -4.6% $1.8B
- FY2023: NPAT (Net Profit After Tax): +13.1% $2.1B
- FY2024: NPAT (Net Profit After Tax): +11.4% $1B
Not sure how this year's results had a +11.4% increase, but only $1B in profits.
Telstra #Belong #Australia #PriceHikes
I recently learned that our Australian Liberal National Party spent tens of thousands of dollars on a campaign involving flyers posted around the suburbs (localities) of our local Greens federal member, Elizabeth Watson Brown. Not only was the material produced, distributed and paid for by the LNP full of lies, but they had the audacity to make the "flyers" appear as though they were from the Greens themselves! 🤦‍♂️ wtf?! #Politics #Sucks
In case of conflict, consider users over authors over implementors over specifiers over theoretical purity.
-- W3C, HTML Design Principles § 3.2 Priority of Constituencies
I can't believe I've been writing Go code for over 8 years already 😮
Just added support for deleting and editing arbitrary Twt(s) at any point in your timeline. Some things to note:
- I'm not really that happy with the code between PostHandler() and DeleteTwtHandler() anymore 😢 It really needs some major refactoring, and better tests.
- This only works for users (for now), no support for Persona(s) / Feeds sorry.
One side-effect I've noticed (which was always the case): an edited Twt, whilst preserving the original timestamp, gets appended to your feed at the bottom of the file. This is counterintuitive when you think about editing text files with a text editor, but it does make sense in the way yarnd treats feeds as append-only (I had just forgotten). I'm not doing anything about this though.
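The append-only behaviour is easier to see with a toy example. This is a sketch of the idea only, not how yarnd implements it; it assumes the usual twtxt line format of a timestamp, a tab and the text, and edit_twt is a made-up helper.

```python
def edit_twt(feed_lines, timestamp, new_text):
    """Drop the twt posted at `timestamp`, then append the edited text with that same timestamp."""
    kept = [
        line for line in feed_lines
        # Keep comments/metadata and every twt with a different timestamp.
        if line.startswith("#") or line.split("\t", 1)[0] != timestamp
    ]
    kept.append(f"{timestamp}\t{new_text}")
    return kept


feed = [
    "# nick = demo",
    "2024-06-01T10:00:00Z\tfirst",
    "2024-06-02T10:00:00Z\tsecond",
]

for line in edit_twt(feed, "2024-06-01T10:00:00Z", "first (edited)"):
    print(line)
```

The edited twt ends up last in the file even though its timestamp is the oldest, which is exactly the out-of-order effect described above. Clients that sort by timestamp won't notice; anyone reading the raw file will.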
Oh boi! 🤦‍♂️ I totally forgot to put this notice up, and the month has flown by so quickly! Sorry folks! Hope it's not too late! ⏰ for our monthly Yarn.social Online Meetup!
- Event: Yarn.social Online Meetup
- When: 22nd June 2024 at 12:00pm UTC (midday)
- Cadence: 4th Saturday of every Month
- Agenda: Anything we want to talk about. Twtxt, Yarn, self hosting, cool stuff you've been working on, chit-chat, whatever
Yarn.social #Meetup
Interview with Senior JS Developer 2024 [NEW] - YouTube Bahahahahaha 🤣 So funny!
Should I just code in a work-around? If the Referer is /post then consider that total bullshit, and ignore? 🤔
Why would a Web Browser set the Referer header incorrectly?! 🤔
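A work-around along those lines might look something like the sketch below. It's hypothetical: the redirect_target name and the "/" fallback are assumptions, and the real handler lives in yarnd's Go code.

```python
from urllib.parse import urlsplit


def redirect_target(referer, default="/"):
    """Where to send the user after posting; a Referer of /post is treated as bogus."""
    if not referer:
        return default
    path = urlsplit(referer).path
    if path.rstrip("/") == "/post":
        # The browser claims we came from the POST endpoint itself; ignore it.
        return default
    return referer


print(redirect_target("https://twtxt.net/post"))
print(redirect_target("https://twtxt.net/discover"))
```

Falling back to a fixed page is the simplest option; remembering the last real page in the session would be nicer but needs more state.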
* 8eef4d5d - (HEAD -> main, origin/main) Add DB restore capability and tools/backu-_db.sh script (19 seconds ago) <James Mills>
Oh well ๐ It works wonderfully!
In the event of a database corruption or loss:
$ URL=http://10.0.0.164:8000 ./tools/backup_db.sh > db.json
mv db.json data/db.restore.json
yarnd ...
I'm finding myself more and more now using the web app on mobile 🤔🧐
@bender Btw how are you finding the new and improved UX at all? 🤔 (this Twt was authored by pressing ^n, then @ and then bender<TAB> and the rest of this twt, finally ^ENTER)
Hypermedia Systems -- Edit: I should have said, I plan to read this book...
Well I'm off to bed. I've fixed as many broken things as I could find. All in the name of improvements eh? ๐ Here's what I am aware of that's still non-functional/broken:
- Link Verification
- Stripping Tracking Params
Pretty much everything else should be working. If something isn't quite right though, please help me out with a concise repro ๐
I tried to fix some more bugs today, but who knows, I may have made things worse 🤣