This is essentially OCR (optical cuneiform recognition), which has its own challenges: the signs are three-dimensional depressions in clay and look different depending on the lighting and shadows, which can be all over the place. The output is a fairly standard rendering, and it looks like they preserve the layout.
Direct link to paper which has good illustrations: https://tau-vailab.github.io/ProtoSnap/
Chinese steles are famously turned 2D through ink rubbing. Wonder if that'd simplify the task and at least provide a more consistent image to work from (compared to a randomly illuminated photo).
Or, as another intermediate step, you could train a model to transform photos into rubbings, giving a more standard representation to work from before training to recognize characters.
The material of most inscriptions is dried clay: rubbing isn't possible without damage AFAIK.
You don't have to do the rubbing physically: if you can 3D-scan the surface, you could do it digitally. But giving the AI less information to work with sounds like a step backwards.
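A minimal sketch of what "doing the rubbing digitally" could mean, assuming the scan is available as a grid of heights. The function name `digital_rubbing`, the `tolerance` parameter, and the sample values are made up for illustration, and treating the highest point as the writing surface is a crude assumption that ignores tablet curvature.

```python
def digital_rubbing(height_map, tolerance=0.1):
    """Return a binary 'rubbing': 1 = inked (surface), 0 = blank (impressed)."""
    # Assumption: the uninscribed surface is the highest part of the scan,
    # so the maximum height approximates where the paper would touch.
    surface = max(max(row) for row in height_map)
    return [
        [1 if surface - h <= tolerance else 0 for h in row]
        for row in height_map
    ]


# Hypothetical 3x4 height map with one wedge impressed in the middle row.
heights = [
    [1.0, 1.0, 1.0, 1.0],
    [1.0, 0.4, 0.7, 1.0],
    [1.0, 0.95, 1.0, 1.0],
]

rubbing = digital_rubbing(heights)
for row in rubbing:
    print("".join("#" if cell else "." for cell in row))
```

The depths 0.4 and 0.7 both collapse to the same blank cell, which is the information loss mentioned above: the result is lighting-independent, but the stroke depth is gone.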
Rubbings are much easier to do, while I'm guessing transcriptions are not. So you could provide a lot of training data.
You could then train one model on rubbing->text, and each research center could make its own photo->rubbing model for its particular photography setup.
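Roughly, that split could look like the sketch below. Everything here is a stand-in: `threshold_model` plays the role of a per-site photo->rubbing model, `shared_rubbing_to_text` is a placeholder for the shared recognizer (it only counts impressed pixels), and the cutoffs and pixel values are invented.

```python
from typing import Callable, List

Photo = List[List[float]]   # grayscale brightness in [0, 1]
Rubbing = List[List[int]]   # 1 = inked surface, 0 = impressed stroke


def threshold_model(cutoff: float) -> Callable[[Photo], Rubbing]:
    """Hypothetical stand-in for a site-specific photo->rubbing model."""
    def to_rubbing(photo: Photo) -> Rubbing:
        return [[1 if px >= cutoff else 0 for px in row] for row in photo]
    return to_rubbing


def shared_rubbing_to_text(rubbing: Rubbing) -> str:
    """Placeholder for the shared recognizer: it only counts impressed pixels."""
    impressed = sum(row.count(0) for row in rubbing)
    return f"{impressed} impressed pixel(s)"


def make_pipeline(
    photo_to_rubbing: Callable[[Photo], Rubbing],
    rubbing_to_text: Callable[[Rubbing], str],
) -> Callable[[Photo], str]:
    """Compose a site-specific front end with the shared back end."""
    def transcribe(photo: Photo) -> str:
        return rubbing_to_text(photo_to_rubbing(photo))
    return transcribe


# Two sites with different lighting share one recognizer.
site_a = make_pipeline(threshold_model(0.5), shared_rubbing_to_text)
site_b = make_pipeline(threshold_model(0.25), shared_rubbing_to_text)

photo_a = [[0.9, 0.2], [0.8, 0.9]]      # bright setup
photo_b = [[0.45, 0.1], [0.4, 0.45]]    # same tablet, darker setup

print(site_a(photo_a))
print(site_b(photo_b))
```

The point of the design is that only the front end has to be retrained per photography setup; the rubbing format acts as the common interface.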
Very interesting approach and perhaps a good step towards transliteration and translation of cuneiform tablets. Code and links for paper at https://github.com/TAU-VAILab/ProtoSnap/tree/main
It looks very ad hoc.
So ideally the ML network would solve the problem end to end, but these authors seem to be using a network for only one step of their otherwise classic image-processing pipeline.
What a smear.
A lot of applied work in vision and audio glues together different existing modules, instead of training the whole thing end-to-end.
In an ideal world, things are ideal. But the world isn't a grad student's wet dream.
But there are not as many influential papers in the world as there are people who respond: "Why didn't they train the whole thing end-to-end?"
"Ideally they'd solve the whole problem, but it looks like they just solved one part of the problem"
Could you fine-tune a VLM that already does OCR well on cuneiform?
Nice, now can any cuneiform nerds that have an interest in Ramses III guide me to any quote that roughly translates to "violence is necessary against Isfet". For the life of me, I cannot seem to find any reference to it although I've got it in my notes from... somewhere?
Grok 3 says "For instance, in the Poem of Pentaur, a propagandistic account of the Battle of Kadesh, Ramses II is depicted as a heroic figure restoring order against the chaotic Hittite forces. While it doesn’t say “violence is necessary against Isfet” verbatim, the subtext is that decisive action, including violence, was justified to crush disorder and uphold Ma'at."
(It had earlier defined Ma'at as "order, balance, and justice")
Awesome.
Now that archeologists are more productive with AI, they can fire the 20% of staff over-hired during COVID and give massive bonuses to the executives running museums and archeology departments.
Archeologists who stay can expect smaller stock grants, too.
Now get back to work before Elon gets here and demands 60+ hr workweeks and requires all 401k plans to be converted to DOGEcoin.
Because a man who hypes Social Security "as a ponzi scheme" knows that crypto is a good business proposition.
Translation of cuneiform is limited by the number of people who know the different Semitic and non-Semitic languages. Bulk transcription of the texts just makes the amount of material available to translate even more overwhelming.
The real revolution would be AI reconstruction of the 3D fragments and filtering of the bulk documents (receipts) from the interesting historical documents (hate mail, spam, poetry, death threats, omens, math, etc.).
The real revolution would be an end-to-end machine transliteration-to-translation pipeline.
…Which will require lots of sample data for all the steps.
Humans learn these languages from tiny samples plus structured practice and dictionaries. Research is required, but AI could be similarly low-info.
You should take a month-long break from reading Twitter for your own health.
Not the place for that.
'Because a man who hypes Social Security "as a ponzi scheme" '
The only thing that differentiates Social Security from a typical Ponzi scheme is that a typical Ponzi scheme is now illegal.
Neither Boomers, Gen X, nor Gen Y are paying enough in contributions to meet their demands; they are instead relying on contributions from future donors.
That is indistinguishable from what Ponzi did.
Apart from the fact that Ponzi's scheme was a fraud, whereas there is no pretence that Social Security schemes actually pay their way. The deductions you pay are better thought of as a membership fee for the scheme.