Overheard


Listening to two undergrads discuss Twitter and semantics over coffee and a sandwich at Bat 17:
A: You gotta stop putting a hash in front of every single word on Twitter, dude.
B: Oh, don't sweat it, I have a bash script that does it automatically.
A: That's not what I'm... Wait. Do you use a command-line Twitter client?
B: Yeah, doesn't everybody?
A: Is that why you never use double quotes? Because you'd have to escape them?
B: ... Yeah, maybe.
A: REGARDLESS. Stop putting hashes everywhere; with the syntax highlighting it looks terrible. You know what hashtags are, right?
B: I do, and I hate them! I hope my hashspam will demonstrate how limited they are as a signifier!
A: What?
B: Their limited expressibility is an embarrassment for the Twitterverse. 10 million tweeters, and the most they can come up with is a hack like hashtags?
A: You understand you only get 140 characters, right? And the hashtags serve as shortcuts for context that doesn't fit the character limit?
B: Yes! I do understand! That's why I have submitted my HashTag spec to IEEE! It takes the brevity of hashtags and adds DEEPER SEMANTICS.
A: Like what?
B: Well, for example. What if your post is only tangentially related to It's Always Sunny in Philadelphia? Currently, you are restricted to a BINARY CLASSIFICATION: either you include #sunny or you don't! With my extensions, you would instead include the simple HASH(tagname="sunny-tv-show"|value="0.5"). This RICHER SEMANTICS allows for DEEPER EXPRESSIVITY.
A: But
B: Negation is also allowed! So in a typical use-case where you refer to the characters in It's Always Sunny in Philadelphia, but not the show proper, you could do HASH(tagname="sunny-tv-show"|value="-1.0"). This would allow current followers or future SEMANTICALLY-AWARE FOLLOW-AGENTS to correctly place your tweet in their HIERARCHICAL INTERNAL DATA STRUCTURES.
A: Who
B: My HashTag format is both HUMAN and MACHINE-READABLE, and is FORWARD-COMPATIBLE. For example, I am working on an extension that permits NON-UTF8 HASHTAG ATTRIBUTES to pave the way for FUTURE INTERNATIONALIZATION. This extension works just as you'd expect: you first specify the encoding using the reserved ENC tag with curly braces, then the tag itself, like ENC{UTF8|HASH(tagname="sunny-tv-show"|value=".2832")}.
A: Why in the world would someone tag their own tweet with a hash value of .2832? Why would they possibly use so many significant digits?
B: Well, humans may not, but future SMART TWITTER CLIENTS could add these kinds of tags and values automatically. The standard specifies that all values are represented internally as 64-BIT DOUBLES so precision isn't a concern.
A: How would you actually use this?
B: Oh, I don't know. Something like "Did anyone else see last night's Sunny? Rowdy Roddy Piper was awesome, lol! ENC{UTF8|HASH(tagname="rowdy-roddy-piper-wrestler"|value=".75")|HASH(tagname="sunny-tv-show"|value=".6")|HASH(tagname="sun-astronomical-body"|value="-1.0")|HASH(tagname="sunny-song-bobby-hebb"|value="-1.0")}"
A: Oh, thanks for clarifying which "Sunny" you were referring to. I was genuinely confused whether it was the TV show we talk about all the time, the Sun in the sky, or the song by some guy named Bobby Hebb that I've never ever heard of.
B: Don't be sarcastic. I admit that these particular tags may be more useful to FUTURE SEMANTICALLY-AWARE
A: Yeah, I get it. It was also way more than 140 characters. Way way more.
B: Was it? Wait, did I specify the encoding?
A: Yeah.
B: Ok, well that wasn't technically necessary, because the specification assumes UTF8 unless it's explicitly set to something else. So if I drop the ENC, that saves like five characters, do you think it's still over 140?
A: I don't know, dude. But yeah, yeah. I'm sure it's still way over. In a whole lot of ways.
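
For anyone who wants to settle the length question, here is a minimal Python sketch. B's spec never came up in any more detail than the examples above, so the regular expression is only a guess at the grammar, and the names HASH_RE, parse_tags, and drop_enc are made up for illustration. It pulls the tag/value pairs out of the example tweet and checks how much dropping the ENC wrapper actually saves.

```python
import re

# The example tweet from the conversation, copied as spoken.
TWEET = (
    "Did anyone else see last night's Sunny? "
    "Rowdy Roddy Piper was awesome, lol! "
    'ENC{UTF8|HASH(tagname="rowdy-roddy-piper-wrestler"|value=".75")'
    '|HASH(tagname="sunny-tv-show"|value=".6")'
    '|HASH(tagname="sun-astronomical-body"|value="-1.0")'
    '|HASH(tagname="sunny-song-bobby-hebb"|value="-1.0")}'
)

# Assumed grammar, inferred only from the examples in the dialogue:
# HASH(tagname="<name>"|value="<number>")
HASH_RE = re.compile(r'HASH\(tagname="([^"]+)"\|value="([^"]+)"\)')


def parse_tags(text):
    """Return (tagname, value) pairs; values become Python floats (64-bit doubles)."""
    return [(name, float(value)) for name, value in HASH_RE.findall(text)]


def drop_enc(text):
    """Strip an ENC{UTF8|...} wrapper, keeping the tags inside it."""
    return re.sub(r"ENC\{UTF8\|(.*)\}", r"\1", text)


if __name__ == "__main__":
    for name, value in parse_tags(TWEET):
        print(name, value)
    print("with ENC:", len(TWEET), "characters")
    print("without ENC:", len(drop_enc(TWEET)), "characters")
```

Under these assumptions, dropping the wrapper saves ten characters rather than five, and the tweet stays well over 140 either way.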