Sredzkistraße

9 Jan 2012

@grammer_man who the fuck is this nigga and why u comin at me like that #Hoeassnigga

Had a spare hour last Thursday and decided to write a little Twitter bot. There he is above. His name is Grammer_Man and he corrects other Twitter users’ misspellings, using data scraped from these Wikipedia pages.

Responses have been pouring in already, some agitated, some confused, but most positive — which was a pleasant surprise. In any event, the minimal amount of effort in coding has paid off many times over in entertainment.

You can see who’s responding at the moment by searching for @grammer_man, and also by checking his list of favourites.

Here is the (somewhat slapdash) code that powers our fearless spelling Nazi:

grabber.py

This module grabs the spelling data from Wikipedia.

Python
#!/usr/bin/env python
# -*- coding: utf-8 -*-
import os
import pickle

import requests
from BeautifulSoup import BeautifulSoup


def grab(letter):
    '''
    Grabs spellings from wikipedia
    '''
    url = 'http://en.wikipedia.org/wiki/Wikipedia:Lists_of_common_misspellings/%s' % letter
    html = requests.get(url).content
    soup = BeautifulSoup(html)
    bullets = soup.findAll('li')
    retval = {}
    for bullet in bullets:
        # Keep only the bullets whose markup mentions 'plainlinks'
        if 'plainlinks' in repr(bullet):
            # Entries look like "mistake (correction)"
            values = bullet.text.split('(')
            if len(values) == 2:
                # shave off the ) at end
                retval[values[0].strip()] = values[1].strip()[:-1]
    return retval


def get_spellings():
    '''
    Returns a dictionary of {misspelling: correction} pairs,
    cached in words.pkl after the first run
    '''
    if not os.path.exists('words.pkl'):
        retval = {}
        for c in 'ABCDEFGHIJKLMNOPQRSTUVWXYZ':
            print 'Getting typos - %s' % c
            retval.update(grab(c))
        print 'Dumping...'
        f = open('words.pkl', 'wb')  # binary mode for pickle
        pickle.dump(retval, f)
        f.close()
        return retval
    else:
        f = open('words.pkl', 'rb')
        retval = pickle.load(f)
        f.close()
        return retval


if __name__ == '__main__':
    get_spellings()
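
The parsing step inside grab() can be tried without touching the network. The following is only a sketch, written for Python 3 rather than the Python 2 of the listing above. The helper names parse_entry and build_table are made up for illustration, and the first two sample lines are invented; the third mirrors a correction the bot is quoted as sending.

```python
def parse_entry(text):
    """Split 'mistake (correction)' into a (mistake, correction) pair.

    Returns None unless the text has exactly one '(' and ends with ')',
    which is a slightly stricter version of the len(values) == 2 check.
    """
    parts = text.strip().split('(')
    if len(parts) != 2 or not parts[1].endswith(')'):
        return None
    mistake = parts[0].strip()
    correction = parts[1][:-1].strip()  # shave off the closing ')'
    if not mistake or not correction:
        return None
    return mistake, correction


def build_table(lines):
    """Collect the parsed pairs into a {mistake: correction} dictionary."""
    table = {}
    for line in lines:
        pair = parse_entry(line)
        if pair is not None:
            mistake, correction = pair
            table[mistake] = correction
    return table


# Hypothetical sample lines standing in for the text of scraped <li> items.
sample = [
    'abandonned (abandoned)',
    'not an entry',
    'funguses (fungi [plural])',
]
print(build_table(sample))
```

Lines that do not match the expected shape are skipped rather than guessed at, so a change in the page layout would shrink the table instead of filling it with junk.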

bot.py

The bot. Selects misspellings at random, searches for them, responds to them, while also taking breaks between tweets and longer breaks every few hours.

Python
#!/usr/bin/env python
# -*- coding: utf-8 -*-
import os
import random
import time
import pickle

import twitter

from grabber import get_spellings

API = twitter.Api(consumer_key='XXX',
                  consumer_secret='XXX',
                  access_token_key='XXX',
                  access_token_secret='XXX')

MESSAGES = u'''
$USERNAME sooo you might wanna spell $CORRECT the right way next time!! Not your fault bro.
#
# All messages stored in here, one per line.
# Edited out in order to save space in this blog post.
#
'''.split('\n')


def compose_message(twitter_post, mistake, correct):
    '''
    Choose a message from MESSAGES at random, substitute fields to personalise it and
    check if it exceeds the twitter message limit. Try this 100 times before failing.
    '''
    retries = 0
    while retries < 100:
        retries += 1
        message = MESSAGES[random.randint(0, len(MESSAGES) - 1)]
        message = message.replace('$USERNAME', '@%s' % twitter_post.user.screen_name)
        message = message.replace('$MISTAKE', '"%s"' % mistake).replace('$CORRECT', '"%s"' % correct)
        # Skip empty lines and anything over 140 characters
        if message and len(message) < 141:
            return message
    return None


def correct_spelling(twitter_post, mistake, correct):
    '''
    Correct someone's spelling in a twitter_post
    '''
    print u'Correcting @%s for using %s...' % (twitter_post.user.screen_name,
                                               mistake)
    message = compose_message(twitter_post, mistake, correct)
    if not message:
        print u'All messages were too long... Aborting...'
        return None
    else:
        API.PostUpdate(message, in_reply_to_status_id=twitter_post.id)
        return True


def search(word):
    '''
    Search twitter for uses of a word, return one if it's been used recently.
    Otherwise return None.
    TODO: Add time awareness.
    '''
    print 'Searching for uses of %s...' % word
    results = API.GetSearch(word)
    if results:
        for result in results:
            # Skip tweets already answered and the bot's own tweets
            if (not check_if_done(result.id)
                    and result.user.screen_name != 'grammer_man'
                    and word in result.text):
                return result
    return None


def check_if_done(id):
    '''
    Checks if a tweet has already been responded to
    '''
    if os.path.exists('done.pkl'):
        f = open('done.pkl', 'rb')
        done = pickle.load(f)
        f.close()
        if id in done:
            return True
    return False


def update_done(id):
    '''
    Updates a list of tweets that've been replied to
    '''
    if os.path.exists('done.pkl'):
        f = open('done.pkl', 'rb')
        done = pickle.load(f)
        f.close()
    else:
        done = []
    done.append(id)
    f = open('done.pkl', 'wb')
    pickle.dump(done, f)
    f.close()


def main():
    '''
    Main program flow
    '''
    words = get_spellings()
    counter = 0
    while True:
        word = random.choice(words.keys())
        post = search(word)
        # Take a long break (2 to 4 hours) after every 100 or so tweets
        if counter > 100:
            rand_time = random.randint(120*60, 240*60)
            print 'Done %s tweets, sleeping for %s minutes' % (counter, rand_time/60)
            time.sleep(rand_time)
            counter = 0
        # TODO: PROPERLY PRUNE THE MISTAKES/CORRECTIONS FROM WIKIPEDIA AND REMOVE THIS:
        if u',' not in word + words[word] and u';' not in word + words[word]:
            if post:
                result = correct_spelling(post, word, words[word])
                if result:
                    counter += 1
                    print '#%s Done' % counter
                    update_done(post.id)
                    # Short break (5 to 8 minutes or so) between tweets
                    time.sleep(random.randint(300, 500))


if __name__ == '__main__':
    main()
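
The message-picking step can also be tried on its own, without a Twitter account. What follows is a sketch of a variation, not the code the bot runs: instead of retrying up to 100 random picks, it visits every template once in random order. It is written for Python 3. The names fill, compose and TWEET_LIMIT are made up for illustration, and the second template is a guess modelled on a reply quoted in the comments, not one of the real MESSAGES.

```python
import random

TWEET_LIMIT = 140  # assumed from the len(message) < 141 check in bot.py


def fill(template, screen_name, mistake, correct):
    """Substitute the three placeholders used in MESSAGES."""
    return (template.replace('$USERNAME', '@%s' % screen_name)
                    .replace('$MISTAKE', '"%s"' % mistake)
                    .replace('$CORRECT', '"%s"' % correct))


def compose(templates, screen_name, mistake, correct, rng=random):
    """Return a randomly chosen filled-in template that fits, else None."""
    # Drop blank lines and '#' placeholder lines before choosing.
    usable = [t for t in templates if t.strip() and not t.startswith('#')]
    # sample() without replacement visits each template at most once,
    # so the loop always ends, even when nothing fits.
    for template in rng.sample(usable, len(usable)):
        message = fill(template, screen_name, mistake, correct)
        if len(message) <= TWEET_LIMIT:
            return message
    return None


templates = [
    '$USERNAME sooo you might wanna spell $CORRECT the right way next time!! Not your fault bro.',
    '$MISTAKE, $USERNAME? Seriously? It is $CORRECT.',  # hypothetical template
]
print(compose(templates, 'icypop', 'funguses', 'fungi [plural]'))
```

Filtering the '#' lines first also keeps the placeholder comments inside MESSAGES from being picked as a reply.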

Grammer_Man uses the following libraries:

  • python-twitter (Be warned: no proxy support)
  • requests
  • BeautifulSoup

This entry was posted on Monday, January 9th, 2012 at 8:06 pm and is filed under Code, Computers, Funny, Idiots, Internet, Oddities.

13 Responses to “@grammer_man who the fuck is this nigga and why u comin at me like that #Hoeassnigga”

  1. Sam says:
    January 9, 2012 at 9:59 pm

    This is some fine work Aengus – hilarious! I just wouldn’t want to be the poor guy whose picture you used… he’s accumulating a lot of hate without even knowing it! Haha

  2. Kevin says:
    January 9, 2012 at 10:12 pm

    *grammar

  3. aengus says:
    January 9, 2012 at 10:15 pm

    See https://twitter.com/#!/grammer_man/status/156029191148670976

  4. tai says:
    January 10, 2012 at 12:26 am

    From @grammer_man: “funguses”, @icypop? Seriously? It’s “fungi [plural]“.

    The problem with relying on Wikipedia—funguses is perfectly acceptable. The same with cactuses and octopuses, though I think ‘octopodes’ is the best sounding accepted plural.

    From Oxford:

    fungus |ˈfʌŋgəs|
    noun ( pl. fungi |-gʌɪ, -(d)ʒʌɪ| or funguses )
    any of a group of unicellular, multicellular, or syncytial spore-producing organisms feeding on organic matter, including moulds, yeast, mushrooms, and toadstools.

  5. Tim McNamara says:
    January 10, 2012 at 12:45 am

    Just a quick note, for future code the random.choice function is quite useful. You could replace MESSAGES[random.randint(0, len(MESSAGES) - 1)] with choice(MESSAGES).

  6. pandres says:
    January 10, 2012 at 2:05 am

    Cool colorscheme, can you pass ir.

  7. aengus says:
    January 10, 2012 at 2:24 am

    Very nice! A hundred times more elegant.

  8. Nik says:
    January 10, 2012 at 4:30 am

    I don’t think you’re iterating your variable ‘retries’ – your code will get stuck in a loop when a word/username is so long that all of your possible messages exceed 141 chars

  9. deadwisdom says:
    January 10, 2012 at 8:20 am

    Use “tweetstream” instead.

  10. aengus says:
    January 10, 2012 at 10:13 am

    Thanks for catching that! I’ve updated the code. As I said above — the code is terribly slap-dash, and so liable to contain bugs :)

  11. Blaise says:
    January 10, 2012 at 4:12 pm

    How many MESSAGES have you got?

  12. Qazi says:
    January 11, 2012 at 10:33 am

    It gives the error ‘Could not authenticate you’. Any workarounds to this? P.S. I am a newbie to python stuff.

  13. Can I Get an Amen? « garyalan.net says:
    January 18, 2012 at 9:50 pm

    [...] to do this in, so I did a little research. My two main sources ended up being @the_shrinkbot and @grammer_man (warning, not safe for work language on grammer_man’s [...]
