if you could pick a standard format for a purpose what would it be and why?
e.g. flac for lossless audio because…
(yes you can add new categories)
summary:
- photos .jxl
- open domain image data .exr
- videos .av1
- lossless audio .flac
- lossy audio .opus
- subtitles srt/ass
- fonts .otf
- container mkv (doesn’t contain .jxl)
- plain text utf-8 (many also say markup but disagree on the implementation)
- documents .odt
- archive files (this one is causing a bloodbath so i picked randomly) .tar.zst
- configuration files toml
- typesetting typst
- interchange format .ora
- models .gltf / .glb
- daw session files .dawproject
- otdr measurement results .xml
Just going to leave this xkcd comic here.
Yes, you already know what it is.
One could say it is the standard comic for these kinds of discussions.
There are too many of these comics, I’ll make one to be the true comic response and unite all the different competing standards
Open Document Standard (.odt) for all documents. In all public institutions (it’s already a NATO standard for documents).
Because the Microsoft Word ones (.doc, .docx) are unusable outside the Microsoft Office ecosystem. I feel outraged every time I need to edit a .docx file because it breaks the layout easily. And some older .doc files cannot even work with Microsoft Word.
Actually, IMHO, there should be some better alternative to .odt as well. Something more out of a declarative/scripted fashion like LaTeX but still WYSIWYG. LaTeX (and XeTeX, for my use cases) is too messy for me to work with, especially when a package is Byzantine. And it can be non-reproducible if I share/reuse the same document somewhere else.
Something has to be made with document files.
Markdown, asciidoc, restructuredtext are kinda like simple alternatives to LaTeX
There is also https://github.com/typst/typst/
It is unbelievable we do not have a standard document format.
What’s messed up is that, technically, we do. Originally, OpenDocument was the ISO standard document format. But then, baffling everyone, Microsoft got the ISO to also have .docx as an ISO standard. So now we have 2 competing document standards, the second of which is simply worse.
I was too young to use it in any serious context, but I kinda dig how WordPerfect does formatting. The formatting codes are hidden by default, but you can show them and manipulate them as needed.
It might already be a thing, but I am imagining a LaTeX-based standard for document formatting would do well with a WYSIWYG editor that would hide the complexity by default, but is available for those who need to manipulate it.
There are programs (LyX, TeXmacs) that implement WYSIWYG for LaTeX; TeXmacs is exceptionally good. I don’t know about the standards, though.
Another problem with LaTeX and most of the other document formats is that they are so bloated and depend on many other tasks that it is hardly possible to embed the tool into a larger document. That’s a bit of criticism for UNIX design philosophy, as well. And LaTeX code is especially hard to make portable.
There used to be a similar situation with PDFs, it was really hard to display a PDF embedded in application. Finally, Firefox pdf.js came in and solved that issue.
The only embedded and easy-to-implement standard that describes a ‘document’ is HTML, for now (with JavaScript for scripting). Only that it’s not aware of page layout. If only there were an extension standard that could make an HTML page into a document…
I was actually thinking of something like markdown or HTML forming the base of that standard. But it’s almost impossible (is it?) to do page layout with either of them.
But yeah! What I was thinking when I mentioned a LaTeX-based standard is to have a base set of “modules” (for a lack of a better term) that everyone should have and that would guarantee interoperability. That it’s possible to create a document with the exact layout one wants with just the base standard functionality. That things won’t be broken when opening up a document in a different editor.
There could be additional modules to facilitate things, but nothing like the 90’s proprietary IE tags. The way I’m imagining this is that the additional modules would work on the base modules, making things slightly easier but that they ultimately depend on the base functionality.
IDK, it’s really an idea that probably won’t work upon further investigation, but I just really like the idea of an open standard for documents based on LaTeX (kinda like how HTML has been for web pages), where you could work on it as a text file (with all the tags) if needed.
Bro, trying to give padding in MS Word, when you know… YOU KNOOOOW… they can convert to HTML. It drives me up the wall.
And don’t get me started on excel.
Kill em all, I say.
This is the kind of thing i think about all the time so i have a few.
- Archive files: .tar.zst
  - Produces better compression ratios than the DEFLATE compression algorithm (used by .zip and gzip/.gz) and does so faster.
  - By separating the jobs of archiving (.tar), compressing (.zst), and (if you so choose) encrypting (.gpg), .tar.zst follows the Unix philosophy of “Make each program do one thing well.”.
  - .tar.xz is also very good and seems more popular (probably since it was released 6 years earlier in 2009), but, when tuned to its maximum compression level, .tar.zst can achieve a compression ratio pretty close to LZMA (used by .tar.xz and .7z) and do it faster[1].
    “zstd and xz trade blows in their compression ratio. Recompressing all packages to zstd with our options yields a total ~0.8% increase in package size on all of our packages combined, but the decompression time for all packages saw a ~1300% speedup.”
- Image files: JPEG XL / .jxl
  - “Why JPEG XL”
  - Free and open format.
  - Can handle lossy images, lossless images, images with transparency, images with layers, and animated images, giving it the potential of being a universal image format.
  - Much better quality and compression efficiency than current lossy and lossless image formats (.jpeg, .png, .gif).
  - Produces much smaller files for lossless images than AVIF[2]
  - Supports much larger resolutions than AVIF’s 9-megapixel limit (important for lossless images).
  - Supports up to 24-bit color depth, much more than AVIF’s 12-bit color depth limit (which, to be fair, is probably good enough).
- Videos (Codec): AV1
  - Free and open format.
  - Much more efficient than x264 (used by .mp4) and VP9[3].
- Documents: OpenDocument / ODF / .odt
  - @raubarno@lemmy.ml says it best here. .odt is simply a better standard than .docx.
    “it’s already a NATO standard for documents”
    “Because the Microsoft Word ones (.doc, .docx) are unusable outside the Microsoft Office ecosystem. I feel outraged every time I need to edit a .docx file because it breaks the layout easily. And some older .doc files cannot even work with Microsoft Word.”
.tar is pretty bad as it lacks an index, making it impossible to quickly seek around in the file. The compression on top adds another layer of complication. It might still work great as a tape archiver, but for sending files around the Internet it is quite horrible. It’s really just getting dragged around for cargo cult reasons, not because it’s good at the job it is doing.
In general I find the archive situation a little annoying, as archives are largely completely unnecessary; that’s what we have directories for. But directories don’t exist as far as HTML is concerned and only single files can be downloaded easily. So everything has to get packed and unpacked again, for absolutely no reason. It’s a job computers should handle transparently in the background, not an explicit user action.
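To make the missing-index point concrete, here is a rough Python sketch using only the standard library's tarfile and zipfile modules. The file names and contents are made up for illustration. It is not a benchmark; it only shows that tar headers sit in front of each file's data, so a reader has to walk the archive to find a member, while a zip reader gets its listing from the central directory stored at the end of the file.

```python
import io
import tarfile
import zipfile

# Hypothetical sample files, made up for this illustration.
FILES = {"a.txt": b"alpha", "b.txt": b"bravo", "c.txt": b"charlie"}


def build_tar(files):
    """Return the raw bytes of an uncompressed tar archive."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w", format=tarfile.USTAR_FORMAT) as tar:
        for name, data in files.items():
            info = tarfile.TarInfo(name)
            info.size = len(data)
            tar.addfile(info, io.BytesIO(data))
    return buf.getvalue()


def build_zip(files):
    """Return the raw bytes of a zip archive with the same members."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, mode="w") as zf:
        for name, data in files.items():
            zf.writestr(name, data)
    return buf.getvalue()


def tar_header_offsets(raw):
    """Walk the archive front to back and record where each header starts."""
    with tarfile.open(fileobj=io.BytesIO(raw), mode="r:") as tar:
        return [(member.name, member.offset) for member in tar.getmembers()]


tar_bytes = build_tar(FILES)
zip_bytes = build_zip(FILES)

# Each 512-byte tar header is followed by that file's data (padded to 512),
# so the only way to locate "c.txt" is to step over "a.txt" and "b.txt".
offsets = tar_header_offsets(tar_bytes)

# ZipFile reads the central directory on open, then jumps straight to a member.
with zipfile.ZipFile(io.BytesIO(zip_bytes)) as zf:
    zip_names = zf.namelist()
    bravo = zf.read("b.txt")

print(offsets)
print(zip_names, bravo)
```

Once the tar is wrapped in a stream compressor, even that walk needs the stream decompressed up to the member you want, which is the extra layer of complication mentioned above.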
Many file managers try to add support for .zip and allow you to go into them like it is a folder, but that abstraction is always quite leaky and never as smooth as it should be.
“By separating the jobs of archiving (.tar), compressing (.zst), and (if you so choose) encrypting (.gpg), .tar.zst follows the Unix philosophy of ‘Make each program do one thing well.’”
wait so does it do all of those things?
So there’s a tool called tar that creates an archive (a .tar file). Then there’s a tool called zstd that can be used to compress files, including .tar files, which then becomes a .tar.zst file. And then you can encrypt your .tar.zst file using a tool called gpg, which would leave you with an encrypted, compressed .tar.zst.gpg archive.
Now, most people aren’t doing everything in the terminal, so the process for most people would be pretty much the same as creating a ZIP archive.
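The same separation of jobs can be sketched in Python with the standard library. This is only an illustration under a few assumptions: the file names and contents are made up, lzma (the xz algorithm) stands in for zstd because it ships with every Python version, and the optional gpg step is left out entirely. The point is just that archiving and compressing are independent stages that each take bytes in and hand bytes on.

```python
import io
import lzma
import tarfile


def archive(files):
    """Stage 1 (tar's job): bundle named files into one uncompressed stream."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w") as tar:
        for name, data in files.items():
            info = tarfile.TarInfo(name)
            info.size = len(data)
            tar.addfile(info, io.BytesIO(data))
    return buf.getvalue()


def compress(raw):
    """Stage 2 (zstd's job in the thread; lzma is a stand-in here)."""
    return lzma.compress(raw)


def unpack(blob):
    """Undo the stages in reverse order: decompress, then read the archive."""
    raw = lzma.decompress(blob)
    with tarfile.open(fileobj=io.BytesIO(raw), mode="r:") as tar:
        return {m.name: tar.extractfile(m).read() for m in tar.getmembers()}


# Hypothetical sample data, made up for this illustration.
files = {"notes.txt": b"hello " * 200, "data.csv": b"1,2,3\n" * 200}

tar_bytes = archive(files)      # what would be saved as backup.tar
packed = compress(tar_bytes)    # what would be saved as backup.tar.xz
restored = unpack(packed)

print(len(tar_bytes), len(packed), restored == files)
```

Swapping the compressor only touches `compress` and the first line of `unpack`; the archive stage never needs to know which one was used.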
“By separating the jobs of archiving (.tar), compressing (.zst), and (if you so choose) encrypting (.gpg), .tar.zst follows the Unix philosophy of ‘Make each program do one thing well.’”
The problem here being that GnuPG does nothing really well.
“Videos (Codec): AV1
- Much more efficient than x264 (used by .mp4) and VP9[3].”
AV1 is also much younger than H.264 (AV1 is a specification, while x264 is an implementation of H.264), and only recently have software encoders become somewhat viable; a more apt comparison would have been AV1 to HEVC, though the latter is also somewhat old nowadays but still a competitive codec. Unfortunately currently there aren’t many options to use AV1 in a very meaningful way; you can encode your own media with it, but that’s about it; you can stream to YouTube, but YouTube will recode to another codec.
“The problem here being that GnuPG does nothing really well.”
Could you elaborate? I’ve never had any issues with gpg before and curious what people are having issues with.
“Unfortunately currently there aren’t many options to use AV1 in a very meaningful way; you can encode your own media with it, but that’s about it; you can stream to YouTube, but YouTube will recode to another codec.”
AV1 has almost full browser support (IIRC) and companies like YouTube, Netflix, and Meta have started moving over to AV1 from VP9 (since AV1 is the successor to VP9). But you’re right, it’s still working on adoption, but this is more so just my dreamworld than it is a prediction for future standardization.
“Could you elaborate? I’ve never had any issues with gpg before and curious what people are having issues with.”
This article and the blog post linked within it summarize it very well.
deleted by creator
I get a better compression ratio with xz than zstd, both at their highest levels, when building an Ubuntu SquashFS image.
Zstd is way faster though.
wait, I’m confused, what’s the difference between .tar.zst and .tar.xz?
Different ways of compressing the initial .tar archive.
deleted by creator
Sounds like a Windows problem
deleted by creator
I get the frustration, but Windows is the one that strayed from convention/standard.
Also, i should’ve asked this earlier, but doesn’t Windows also only look at the characters following the last dot in the filename when determining the file type? If so, then this should be fine for Windows, since there’s only one canonical file extension at a time, right?
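For what the "last dot" rule means in practice, here is a small Python sketch. It does not model how Windows itself decides file types; `last_dot_extension` is a made-up helper that just applies the rule as described, shown next to what pathlib reports for the same name.

```python
from pathlib import PurePath


def last_dot_extension(filename):
    """Hypothetical helper: keep only what follows the final dot."""
    stem, dot, ext = filename.rpartition(".")
    # No dot at all, or a leading-dot name like ".bashrc", counts as no extension.
    return dot + ext if stem else ""


name = PurePath("backup.tar.zst")

last_extension = name.suffix      # only the final component
all_extensions = name.suffixes    # every dotted component, in order

print(last_extension, all_extensions)
print(last_dot_extension("backup.tar.zst"), last_dot_extension("Example.zip"))
```

Under that rule a tool sees `.zst` and has to open the stream before it can tell there is a tar archive inside, which is the information the double extension carries in the name.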
deleted by creator
There already are conventional abbreviations: see Section 2.1. I doubt they will be better supported by tools though.
deleted by creator
In this case it really seems this windows convention is bad though. It is uninformative. And abbreviations mandate understanding more file extensions for no good reason. And I say this as primarily a windows user. Hiding file extensions was always a bad idea. It tries to make a simple reduced UI in a place where simple UI is not desirable. If you want a lean UI you should not be handling files directly in the first place.
Example.zip from the other comment is not a compressed .exe file, it’s a compressed archive containing the exe file and some metadata. Windows standard tools would be in real trouble trying to understand unarchived compressed files many programs might want to use for logging or other data dumps. And that means a lot of software use their own custom extensions that neither the system nor the user knows what to do with without the original software. Using standard system tools and conventions is generally preferable.
I would argue what windows does with the extensions is a bad idea. Why do you think engineers should do things in favour of these horrible decisions the most insecure OS is designed with?
Damn didn’t realize that JXL was such a big deal. That whole JPEG recompression actually seems pretty damn cool as well. There was some noise about GNOME starting to make use of JXL in their ecosystem too…
is av1 lossy
AV1 can do lossy video as well as lossless video.
I don’t know what to pick, but something other than PDF for the task of transferring documents between multiple systems. And yes, I know, PDF has its strengths and there’s a reason why it’s so widely used, but it doesn’t mean I have to like it.
Additionally, all proprietary formats, especially ones that have gained enough users that they’re treated like a standard or requirement if you want to work with X.
oh it’s x, not x… i hate our timeline
I would be fine with PDFs exactly the same except Adobe doesn’t exist and neither does Acrobat.
Resume information. There have been several attempts, but none have become an accepted standard.
When I was a consultant, this was the one standard I longed for the most. A data file where I could put all of my information, and then filter and format it for each application. But ultimately, I wanted to be able to submit the information in a standardised format - without having to re-enter it endlessly into crappy web forms.
I think things have gotten better today, but at the cost of a reliance on a monopoly (LinkedIn). And I’m not still in that sort of job market. But I think that desire was so strong it’ll last me until I’m in my grave.
Literally any file format except PDF for documents that need to be edited. Fuck Adobe and fuck Acrobat
JPEG-XL for rasterized images.
I agree.
I especially love that it addresses the biggest pitfall of the typical “fancy new format does things better than the one we’re already using” transition, in that it’s specifically engineered to make migration easier, by allowing a lossless conversion from the dominant format.
Never heard of that, thanks for bringing it to my attention!
deleted by creator
GNOME introduced its support in version 45, AFAIK there isn’t a stable distro release yet that ships it.
Unfortunately, adoption has been slow and Alliance for Open Media are pushing back somewhat (especially Google[1], who leads the group) in favor of their inferior .avif format.
How does it compare to AVIF?
AVIF is slower, has a way smaller maximum resolution, and doesn’t support progressive decoding or lossless JPEG recompression.
Oh damn, that resolution limit is a total deal breaker. Can’t believe anyone would release a format with those limitations today…
Data output from manufacturing equipment. Just pick a standard. JSON works. TOML / YAML if you need to write as you go. Stop creating your own format that’s 80% JSON anyways.
JSON is nicer for some things, and YAML is nicer for others. It’d be nice if more apps would let you use whichever you prefer. The data can be represented in either, so let me choose.
I don’t give a shit which debugging format any platform picks, but if they could each pick one that every emulator reads and every compiler emits, that’d be fucking great.
Even simpler, I’d really like if we could just unify whether or not $ is needed for variables, and pick # or // for comments. I’m sick of breaking my brain when I flip between languages because of these stupid nuance inconsistencies.
Don’t forget ; is a comment in assembly.
For extra fun, did you know // wasn’t standardized until C99? Comments in K&R C are all /* */. Possibly the most tedious commenting format ever devised.
/* */ is used in CSS as well, I think.
Also we’ve got VB (and probably BASIC) out there using ' because why not lol
[EDIT] I stand corrected by another comment: REM is what BASIC uses. DOS batch files use that, too. They’re old though, maybe we give them a pass “it’s okay grampa, let’s get you back to the museum” 🤣 (disclaimer: I am also old, don’t worry)
It does not work like that. $ is required in shell languages because they have quoteless strings and need to be super concise when calling commands.
# and // are valid identifiers in many languages and all of them are well beyond the point of no return. My suggestion is to make use of your editor’s “turn this line into line comment” function and stop remembering them by yourself.
That just sounds impossible given the absolute breadth of all platforms and architectures though.
One per-thing is fine.
I wish there was a more standardized open format for documents. And more people and software should use markdown/.md because you just don’t need anything fancier for most types of documents.
Yes, but only if everyone adheres to the CommonMark version of Markdown.
Nah, Pandoc Markdown is the true path.
why, what even is markdown
I agree, we need support for it in LibreOffice and then other document editors.
We cannot expect people to use codes, but an editor that saves to it would be great.
Standardized open format for documents might have been the only ISO meeting where people were protesting in the streets - https://en.wikipedia.org/wiki/Standardization_of_Office_Open_XML
So now ISO officially has two standard formats for the exact same thing!
Impressive! Thanks for sharing. I didn’t know there was a standard. So if someone sends me a docx I can ask them for an iso format now :)
I’d set up a working group to invent something new. Many of our current formats are stuck in the past, e.g. PDF or ODF are still emulating paper, even though everybody keeps reading them on a screen. What I want to see is a standard document format that is built for the modern-day Internet, with editing and publishing in mind. HTML ain’t it, as that can’t handle editing well or long-form documents, EPUB isn’t supported by browsers, Markdown lacks a lot of features, etc. And then you have things like Google Docs, which are Internet aware, editable, shareable, but also completely proprietary and lock you into the Google ecosystem.
XML for machine-readable data because I live to cause chaos.
Either markdown or Org for human-readable text-only documents. MS Office formats and the way they are handled have been a mess since the 2007 -x versions were introduced, and those and Open Document formats are way too bloated for when you only want to share a presentable text file.
While we’re at it, standardize the fucking markdown syntax! I still have nightmares about Reddit’s degenerate four-space-indent code blocks.
I’d like an update to the epub ebook format that leverages zstd compression and jpeg-xl. You’d see much better decompression performance (especially for very large books,) smaller file sizes and/or better image quality. I’ve been toying with the idea of implementing this as a .zpub book format and plugin for KOReader but haven’t written any code for it yet.
.gltf/.glb for models. It’s way less of a headache than .obj and .dae, while also being way more feature rich than either.
Either that or .blend, because some things other than blender already support it and it’d make my life so much easier.
JPEG XL for images because it compresses better than JPEG, PNG and WEBP most of the time.
XZ because it theoretically offers the highest compression ratio in most circumstances, and long decompression time isn’t really an issue when the alternative is downloading a larger file over a slow connection.
Config files stored as serialized data structures instead of in plain text. This speeds up read times and removes the possibility of syntax or type errors. Also, fuck JSON.
I wish there were a good format for typesetting. Docx is closed and inflexible. LaTeX is unreadable, inefficient to type and hard to learn due to the inconsistencies that arise from its reliance on third-party packages and its lack of guidelines for their design.
Typst for typesetting. Definitely underrated.
TeX / LaTeX documentation is infuriating. It’s either “use your university’s package to make a document that looks like this:” -or- program in alien assembly language.
I like postscript for graphic design, but not so much for typesetting. For a flyer or poster, PS is great.