RWET Final Proposal

Real news for a fictional world:

The idea I have for the final is to create a sort of news service for fictional worlds. My thinking right now is that I’ll work on something that reads in current headlines and picks one (or maybe uses all of them) to construct a headline for a fictional (e.g. TV or video game) universe. My plan is to use spaCy to pick out noun phrases and other parts of speech and then replace them with similar parts of speech from Wikia fandom pages. This may include descriptions of the articles as well. Ideally this will create a natural contrast with the narrative of “fake news”. Here are some examples done by hand:

 

I’m still trying to figure out how the presentation of this will work; maybe I will read it like a newspaper. I’m not sure if it should come from one fandom or multiple.


RWET – Markov / Ngram

This week I took my headline gathering script and applied the Ngram and Markov chain library to it. After adjusting the length of the Ngrams and trying out some “key” words I ended up with something rather simple but effective in terms of the confusion of news and time effect I’m going for. Basically I take all of my headlines and pass them through the Generate_from_tokens function to create new headlines with an Ngram length of 6. I liked the results here because the headlines are mostly generated but occasionally repeat the source material, so when you read through a list of them it’s unclear what’s real and what isn’t. Also, some of them worked out to be kinda funny. Here are some of the highlights:

Hey, normals, Martha Stewart wants to Nerve Gas in Syrian man says

The Army could go  boring, bad, and really bad

Pepsi pull offensive Kendall Jenner ad after he felt like a ‘clown’

Worlds largest canary discovers Weird, Unexplain benefits sacrificed to cut deficit

Syria and Assad Altered After Sexual Claims Against Beijing: Dalai Lama

Demand For Seal Products From That Viral Second Grade Survey Are Finally Here

Samsung Galaxy S8, Samsung Galaxy Note 8 releases Windows 10

US threatens history

Angry Mets fan tells Alex Rodriguez how he used to revel in his Syrian chemical Attack

‘Other Fish to Fry’: Despite Screaming Person

How Target Botched Its Response to Syria

NBA draft lottery odds: Where have all the MLB superstars gone?

‘Designated Survivor’ Recap: Cato Hosts a Fundraiser With Frederick Douglass

Trump fumbles, Putin as a gay clown

Inside of Job-Killing $1 Billionaires who run Trump’s Israel-Palestine efforts

Rogue Ones best phone ever

Pepsi Pulls Ad Accused of causing index funds, Leon Cooperman is now illegal in Russia connection

RIP Facebook will replace it.

Catastrophic water on growing Uber (or Yahoo or any of it)

Donald Trump Warns Syria, North Korea nuke program, FOI investigation red lines and Kendall Jenner

Bannon Removes Stephen Bannon from Chrome and John Cena had to account?

Controversial Trump Removes Steve Ballmer, Reid Hoffman and NSC: Huge Victory in south Wales game

Toddler attack, US warns of unilateral Syrian gas attack’: Trump Campaigned for break

IPL 2017, Match 1: Yuvraj’s blitz proves to Ban Jehovahs Witnesses as it works to overhaul its business

Exclusive–Sen. Ron Johnson suffers fashion dominant win for the Astros, the reverse

Theranos Found On Shipwreck

What to expect from National Security Council

White Housewives of India

Analysis: For Trump: ‘I now have broken the Oval Office

19 Restaurants Where You Might Just Take A Minute To Appreciate Old Pugs

NBA Mock Draft 4.0: New top-10 picks, new language of gig economy

Trickier-than-usual conditions on the Probes Into Russia’s ‘fanciful’ explanation for Space Project

Rapid rise of clothes moths threatens ‘our own action’ in Syrian gas massacre

Chemical attacked by malicious Wi-Fi networks

Woman Who Degrades Women

haley threatened to enjoy the next year

woman who helped launched an ancient land and sea missiles

dealmaster: today only,  hydrogen cars from national security council

van jones: trump remarks on news coverage of gig economy

smile if you think robots can be fatally hacked by pitbull-cross in playground mock draft day

what we do know about that viral security council

youtube tv goes `beyond red lines’

‘he’s taking my place’: trump should be terrified

plan a netflix marathon and we’ll guess what happened

spotify’s new deal with frederick douglass

haley threatens historic fabrics

The exact algorithm changes a bit here but the result is generally the same. I tried adding additional parameters to the Markov function and other things, but the more complicated I made the process the less interesting the results were to me. I was thinking it would be cool to write out a sentence with the key words and have generated sentences follow them, or to continue my time and date thing, but the key seemed like it needed to be a specific length and I couldn’t figure out how to adjust for that.
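Here is a minimal sketch of this kind of generation, not the script from the post. It assumes a character-level Markov chain in which every 6-character Ngram maps to the characters seen after it (results like “White Housewives” suggest character-level Ngrams, but that is a guess). The names `build_model` and `generate` are hypothetical and are not the library's Generate_from_tokens function; the sample headlines are copied from the Loops and Lists post further down this page.

```python
import random
from collections import defaultdict

HEADLINES = [
    "Rex Tillerson Arrives in Mexico Facing Twin Threats to Relations",
    "Trump Meets With Corporate CEOs Thursday on Economic Policies",
    "ACLU Sues Milwaukee Over Police Stop-and-Frisk Policy",
    "Kim Jong-nam killing: North Korea condemns Malaysia",
]

def build_model(headlines, n=6):
    """Map each n-character window to the characters that followed it."""
    model = defaultdict(list)
    starts = []
    for text in headlines:
        if len(text) < n:
            continue  # too short to contain a full window
        starts.append(text[:n])
        for i in range(len(text) - n):
            model[text[i:i + n]].append(text[i + n])
        # None marks the end of a headline.
        model[text[-n:]].append(None)
    return model, starts

def generate(model, starts, n=6, max_len=120, rng=random):
    """Walk the chain from a random headline opening until an end marker."""
    out = rng.choice(starts)
    while len(out) < max_len:
        nxt = rng.choice(model[out[-n:]])
        if nxt is None:
            break
        out += nxt
    return out

model, starts = build_model(HEADLINES)
for _ in range(3):
    print(generate(model, starts))
```

With only a handful of source headlines and a window of 6, most results will simply repeat a source headline. The blending only shows up once many headlines share 6-character stretches, which may be part of why the real and generated headlines are hard to tell apart.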

Code:


RWET – Functions

This week I took my previous assignment and tried to get all the printing to work through a function. I got it all working in the end but I had a surprising amount of difficulty.

I call the function like this:

 

Here’s an example of the result, which hasn’t changed much.

 

Ashwin’s dig: March 30 will be known as ‘World Apology Day’ – Times of India

thirtieth

Both Houses adjourned till 11 am on March 31

Does Science Advance One Funeral at a Time?

 

Both Houses adjourned till 11 am on March 31

thirtieth

Supreme Court refers triple talaq case to 5-judge constitutional bench; hearing to begin May 11 – Times of India

No stamping of flyers’ handbags at 7 airports from April 1 – Times of India

I tried to set this aside as a module but it required a lot of libraries and variables from my main script. I guess I could have passed all those through as parameters but it felt kinda hacky so I left it all as one file.


RWET – Poetic Form

I once again continued working with news headlines this week. However, this time I think I have a more robust collection of headlines. I found a news API that lets me pull in 550+ headlines from 60 different news sources. The code I wrote basically gets the list of sources from the API and then uses that list to request 10 headlines from each source. These headlines are then saved together in a list, and if any of them contain numbers (in either written or numerical form) they are sorted into another list based on the numbers they use. E.g. “Five people discover volcano” would be assigned to a list within the 5th slot of another list. I then got the real time and date and tried to have the program print the date and time using these headlines. Here are some results:

2:10:

Alan Shearer Names The Two Strikers Who Can Break His Premier League Goal Record | SPORTbible: This startup wants to send electric planes from London to Paris within 10 years

March 23, 2:39:

The best laptop deals in March 2017: cheap laptops for every budget, twenty-third,

Less than half of women breastfeed after two months, survey finds: thirty-nine

March 23, 3:01:

Obituary: Martin McGuinness died on March 21st, twenty-third

Germany 1-0 England: Three Lions stars rated and slated: Germany 1 England 0: Lukas Podolski bows out with signature thunderbolt to light up final match

3:02:

Police officer, three others killed in Wisconsin shooting: reports: Less than half of women breastfeed after two months, survey finds

3:03:

South Korean ferry in which hundreds died raised after three years: Police Officer, 3 Others Killed in a Shooting in Wisconsin

3:04:

Anti-terror police arrest three in Birmingham after Westminster attack: U.K. Parliament Attacker Leaves Four Dead, Including Police Officer

3:05:

Germany 1-0 England: Three Lions stars rated and slated: Terror Attack Near British Parliament Leaves 5 Dead

[More updates coming]

The last two I found particularly interesting because the headlines refer to the same event with different numbers. In this case it seems like the number of fatalities increased and the second was an updated headline. This is essentially what I was interested in by making a time based form that deals with news. I wanted to create something that relays time in terms of world events. This update underscores both how prevalent this event was in the news of that time and the nature of the event itself. The juxtaposition of the two headlines about the London attacks with different death tolls actually makes the event seem realer, to me at least. As if that extra person passed away while you were reading about it.

The program also just writes the value of the time if it can’t find a headline containing that number. I thought this would create the effect of having “unburdened” numbers, or numbers that haven’t been implicated in any major news that day. Refreshing this list over and over shows how certain numbers become widely reported based on the day’s events while others don’t. I still haven’t nailed this news thing down to where I want it exactly (I think part of the problem is that it takes too long to read through a 24 hour clock) but I think this is the closest I’ve gotten so far in terms of making something interesting.
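Here is a rough sketch of the sorting step described in this post. It assumes a small lookup table for written numbers: `NUMBER_WORDS` only covers one through twelve and is a hypothetical stand-in, as are the function names. The fallback here prints the bare number, whereas the results above spell it out. The sample headlines are taken from the results above.

```python
import re
from collections import defaultdict

# Hypothetical lookup table; only covers the written numbers one through twelve.
NUMBER_WORDS = {
    "one": 1, "two": 2, "three": 3, "four": 4, "five": 5, "six": 6,
    "seven": 7, "eight": 8, "nine": 9, "ten": 10, "eleven": 11, "twelve": 12,
}

def numbers_in(headline):
    """Return the set of numbers a headline mentions, as digits or as words."""
    found = set()
    for token in re.findall(r"[A-Za-z]+|\d+", headline):
        if token.isdigit():
            found.add(int(token))
        elif token.lower() in NUMBER_WORDS:
            found.add(NUMBER_WORDS[token.lower()])
    return found

def index_by_number(headlines):
    """Build a mapping from each number to the headlines that mention it."""
    slots = defaultdict(list)
    for headline in headlines:
        for n in numbers_in(headline):
            slots[n].append(headline)
    return slots

def tell(value, slots):
    """Use the first matching headline for a value, or fall back to the bare number."""
    return slots[value][0] if slots.get(value) else str(value)

headlines = [
    "Police officer, three others killed in Wisconsin shooting: reports",
    "Less than half of women breastfeed after two months, survey finds",
    "Terror Attack Near British Parliament Leaves 5 Dead",
]
slots = index_by_number(headlines)
# Telling the time 3:02 with headlines, and 3:39 with one "unburdened" number.
print(tell(3, slots) + ": " + tell(2, slots))
print(tell(3, slots) + ": " + tell(39, slots))
```

Using a dictionary of lists keyed by number plays the same role as the “list within the 5th slot of another list” idea, without needing empty slots for numbers that never appear.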

 

 


RWET – APIs

I continued to work with headlines and news. This time I used an API to grab the most popular headlines on NYTimes.com and pick a random word to replace with a definition from the Oxford dictionary. Sometimes the results don’t make sense but every now and then an interesting replacement occurs. Here are some examples of the output (my favorites are in bold):

Old Headlines:

A Quiet Giant of Investing Weighs In on Trump

Mexico City, Parched and Sinking, Faces a Water Crisis

You May Want to Marry My Husband

Steve Bannon Carries Battles to Another Influential Hub: The Vatican

Lessons on Aging Well, From a 105-Year-Old Cyclist

What a Failed Trump Administration Looks Like

Why Nobody Cares the President Is Lying

A Crack in an Antarctic Ice Shelf Grew 17 Miles in the Last Two Months

Lower Back Ache? Be Active and Wait It Out, New Guidelines Say

36 Hours in San Diego

How New York City Gets Its Electricity

7 Earth-Size Planets Orbit Dwarf Star, NASA and European Astronomers Say

Trump Campaign Aides Had Repeated Contacts With Russian Intelligence

Mental Health Professionals Warn About Trump

Ignorance Is Strength

Keep or Replace Obamacare? It Might Be Up to the States.

How Uber Deceives the Authorities Worldwide

New Headlines:

A Quiet Giant of Investing Weighs In on a trumpet or a trumpet blast

Mexico City, Parched a Boolean operator which gives the value one if and only if all the operands are one, and otherwise has a value of zero Sinking, Faces a Water Crisis

used to refer to the person or people that the speaker is addressing May Want to Marry My Husband

Steve Bannon Carries Battles to Another having great influence on someone or something Hub: The Vatican

Lessons on Aging Well, From used when mentioning someone or something for the first time in a text or conversation 105-Yeused when mentioning someone or something for the first time in a text or conversationr-Old Cyclist

What a Failed Trump the process or activity of running a business, organization, etc. Looks Like

a reason or explanation Nobody Cares the President Is Lying

A very good or skilful in an Antarctic Ice Shelf Grew 17 Miles in the Last Two Months

Lower Back Ache? Be engaging or ready to engage in physically energetic pursuits and Wait It Out, New Guidelines Say

36 Hours in relating to the San or their languages Diego

How New York a large town Gets Its Electricity

7 Earth-Size Planets Orbit Dwarf Star, NASA and relating to or characteristic of Europe or its inhabitants Astronomers Say

Trump Campaign Aides Had Repeated Contacts accompanied by (another person or thing) Russian Intelligence

Mental Health Professionals Warn on the subject of; concerning Trump

Ignorance Iceland (international vehicle registration) Strength

Keep or Replace Obamacare? It Might Be Up used with the base form of a verb to indicate that the verb is in the infinitive, in particular the States.

How Uber Deceives denoting one or more people or things already mentioned or assumed to be common knowledge Authorities Worldwide

If I run it again, it will often yield different results with the same headlines:

 

New Headlines:

A making little or no noise Giant of Investing Weighs In on Trump

Mexico City, Parched and Sinking, Faces a pour or sprinkle water over (a plant or area) in order to encourage plant growth Crisis

You May Want to join in marriage My Husband

What a Failed Trump the process or activity of running a business, organization, etc. Looks Like

Why Nobody Cares the the elected head of a republican state Is Lying

A Crack in the form of the indefinite article (see a) used before words beginning with a vowel sound Antarctic Ice Shelf Grew 17 Miles in the Last Two Months

in what way or manner; by what means New York City Gets Its Electricity

36 Hours in a borough of New York City, at the south-western corner of Long Island. The Brooklyn Bridge (1869\u201383) links Long Island with lower Manhattan

7 Earth-Size Planets Orbit cause to seem small or insignificant in comparison Star, NASA and European Astronomers Say

Trump Campaign Aides Had Repeated Contacts With Russian the ability to acquire and apply knowledge and skills

Mental Health Professionals inform someone in advance of a possible danger, problem, or other unpleasant situation About Trump

lack of knowledge or information Is Strength

Keep or Replace Obamacare? used to refer to a thing previously mentioned or easily identified Might Be Up to the States.

How Uber Deceives denoting one or more people or things already mentioned or assumed to be common knowledge Authorities Worldwide
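The replacement step can be sketched without any network calls. This is an illustration under assumptions, not the original script: `DEFINITIONS` is a hypothetical stand-in for the dictionary API responses (the definitions are copied from the output above), and `replace_random_word` only chooses among words it has a definition for, whereas the post picks a random word first and then looks it up.

```python
import random

# Hypothetical stand-in for dictionary API lookups; definitions copied from the output above.
DEFINITIONS = {
    "city": "a large town",
    "why": "a reason or explanation",
    "marry": "join in marriage",
    "ignorance": "lack of knowledge or information",
}

def replace_random_word(headline, definitions, rng=random):
    """Swap one randomly chosen word that has a definition for that definition."""
    words = headline.split(" ")
    candidates = [i for i, w in enumerate(words) if w.lower() in definitions]
    if not candidates:
        return headline  # nothing to look up, leave the headline alone
    i = rng.choice(candidates)
    words[i] = definitions[words[i].lower()]
    return " ".join(words)

print(replace_random_word("How New York City Gets Its Electricity", DEFINITIONS))
# prints: How New York a large town Gets Its Electricity
```

This version replaces whole words by position, so it would not reproduce the “105-Ye…r-Old” result above. That one looks like the effect of replacing the letters “a” wherever they occur in the string, including inside “Year”.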

 


RWET – Cut up 2: Sets and dictionaries

This week I continued to work with news and tried to make some lightly cut up headlines. What I ended up with is a script that reads in a whole article, splits it into headlines and body (kind of hacky), finds words unique to each article and then randomly replaces words in the headlines with these unique words. Here are some results:

I had to run this a bunch of times to get good results but I was happy with the way it created this news blending effect. I could see working more on distorting news articles.

The code is a bit sloppy this time but I was able to make some light usage of dictionaries and sets. I found myself relying on lists a lot out of familiarity but could begin to see how other structures would be useful.
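Here is a minimal sketch of the set-based part of this idea, not the script from the post: set difference finds the words unique to each text, and one headline word is then swapped for one of them. The function names are hypothetical, and since no article bodies appear on this page, two headlines about the same event from a later post stand in for the bodies.

```python
import random

def unique_words(articles):
    """For each text, keep only the words that appear in none of the other texts."""
    word_sets = [set(body.lower().split()) for body in articles]
    unique = []
    for i, words in enumerate(word_sets):
        others = set().union(*(s for j, s in enumerate(word_sets) if j != i))
        unique.append(words - others)
    return unique

def cut_up(headline, pool, swaps=1, rng=random):
    """Replace `swaps` randomly chosen headline words with words from `pool`."""
    words = headline.split()
    pool = sorted(pool)  # sets have no order, so sort before sampling
    for i in rng.sample(range(len(words)), min(swaps, len(words))):
        if pool:
            words[i] = rng.choice(pool)
    return " ".join(words)

# Stand-ins for article bodies (assumption: real bodies would be much longer).
bodies = [
    "Police officer, three others killed in Wisconsin shooting: reports",
    "Police Officer, 3 Others Killed in a Shooting in Wisconsin",
]
unique = unique_words(bodies)
print(unique[0])
print(cut_up("Rex Tillerson Arrives in Mexico Facing Twin Threats to Relations", unique[0]))
```

Splitting on whitespace leaves punctuation attached, so “shooting:” and “shooting” count as different words here. Stripping punctuation first would give a stricter idea of what is unique.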

 


RWET – Loops and Lists

This week in Reading and Writing Electronic Text I worked with loops and lists to make a script that breaks a series of headlines up into a sort of 24 hour clock. The idea is that each headline is slowly revealed over the course of a day as a play on the idea of the 24 hour news cycle.

I started with a text file that contained headlines from a bunch of large news outlets:

PROBLEM OF THEIR OWN Canada sees spike in border crossings from US
Rex Tillerson Arrives in Mexico Facing Twin Threats to Relations
Kim Jong-nam killing: North Korea condemns Malaysia
ACLU Sues Milwaukee Over Police Stop-and-Frisk Policy
Trump Meets With Corporate CEOs Thursday on Economic Policies
Bannon told EU it was flawed just before Pence’s visit
7 potentially habitable exoplanets discovered
2 cops charged in Florida woman’s accidental shooting death
Obama-linked activists have a ‘training manual’ for protesting Trump

I then shuffled the order of the headlines (to keep the output interesting) and broke each line into a list of words. The number of words in each line was used to figure out how to display it over the course of a day, and then each word is assigned a time slot. I created a loop to simulate a 24 hour cycle with this method. I imagine this could be a part of an installation that would act like a very slow-moving news ticker.

The code:

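import sys
import random

all_lines = []
minutes = -60
lines = []

# Read one headline per line from standard input.
for l in sys.stdin:
    lines.append(l.strip())

# Shuffle so the order of the output changes from run to run.
random.shuffle(lines)

# Break each headline into a list of words.
for line in lines:
    line = line.strip()
    word = line.split(" ")
    all_lines.append(word)

# Step through a 24 hour day, one hour at a time.
for x in range(24):
    wordTimerForEachHeadline = []
    currentWordNum = []
    minutes += 60
    print("\nToday's Headlines", str(minutes // 60) + ":00")
    for i in range(len(all_lines)):
        # Minutes each word stays up: the 1440 minutes in a day split evenly across the words.
        wordTimerForEachHeadline.append(1440 // len(all_lines[i]))
        # Index of the word whose time slot covers the current minute.
        currentWordNum.append(minutes // wordTimerForEachHeadline[i])
        print(all_lines[i][currentWordNum[i]])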
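To make the time slot arithmetic concrete, here is the same calculation pulled out into a small function (the name `word_at` is hypothetical) and run on one of the headlines above. With 5 words, each word gets 1440 // 5 = 288 minutes, so the word on display changes roughly every 4.8 hours.

```python
MINUTES_PER_DAY = 1440

def word_at(headline, minutes):
    """Return the word of `headline` whose time slot contains `minutes` past midnight."""
    words = headline.split(" ")
    slot = MINUTES_PER_DAY // len(words)  # minutes each word stays visible
    return words[minutes // slot]

headline = "7 potentially habitable exoplanets discovered"
for hour in (0, 5, 10, 15, 20):
    print(str(hour) + ":00", word_at(headline, hour * 60))
# 0:00 7
# 5:00 potentially
# 10:00 habitable
# 15:00 exoplanets
# 20:00 discovered
```

Because the slot length is rounded down, this only works as long as a headline has far fewer words than there are minutes in a day, which is always true for headlines.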

The output:

Today’s Headlines 0:00

Trump

Obama-linked

7

ACLU

Rex

PROBLEM

Bannon

Kim

2

Today’s Headlines 1:00

Trump

Obama-linked

7

ACLU

Rex

PROBLEM

Bannon

Kim

2

Today’s Headlines 2:00

Trump

Obama-linked

7

ACLU

Rex

OF

Bannon

Kim

2

Today’s Headlines 3:00

Meets

activists

7

ACLU

Tillerson

OF

told

Kim

cops

Today’s Headlines 4:00

Meets

activists

7

Sues

Tillerson

THEIR

told

Jong-nam

cops

Today’s Headlines 5:00

Meets

activists

potentially

Sues

Arrives

THEIR

EU

Jong-nam

cops

Today’s Headlines 6:00

With

have

potentially

Sues

Arrives

OWN

EU

Jong-nam

charged

Today’s Headlines 7:00

With

have

potentially

Milwaukee

Arrives

OWN

EU

killing:

charged

Today’s Headlines 8:00

Corporate

a

potentially

Milwaukee

in

Canada

it

killing:

in

Today’s Headlines 9:00

Corporate

a

potentially

Milwaukee

in

Canada

it

killing:

in

Today’s Headlines 10:00

Corporate

a

habitable

Milwaukee

Mexico

sees

was

killing:

in

Today’s Headlines 11:00

CEOs

‘training

habitable

Over

Mexico

sees

was

North

Florida

Today’s Headlines 12:00

CEOs

‘training

habitable

Over

Facing

spike

flawed

North

Florida

Today’s Headlines 13:00

CEOs

‘training

habitable

Over

Facing

spike

flawed

North

Florida

Today’s Headlines 14:00

Thursday

manual’

habitable

Police

Facing

in

flawed

Korea

woman’s

Today’s Headlines 15:00

Thursday

manual’

exoplanets

Police

Twin

in

just

Korea

woman’s

Today’s Headlines 16:00

on

for

exoplanets

Police

Twin

border

just

Korea

accidental

Today’s Headlines 17:00

on

for

exoplanets

Police

Threats

border

before

Korea

accidental

Today’s Headlines 18:00

on

for

exoplanets

Stop-and-Frisk

Threats

crossings

before

condemns

accidental

Today’s Headlines 19:00

Economic

protesting

exoplanets

Stop-and-Frisk

Threats

crossings

before

condemns

shooting

Today’s Headlines 20:00

Economic

protesting

discovered

Stop-and-Frisk

to

from

Pence’s

condemns

shooting

Today’s Headlines 21:00

Economic

protesting

discovered

Policy

to

from

Pence’s

Malaysia

shooting

Today’s Headlines 22:00

Policies

Trump

discovered

Policy

Relations

US

visit

Malaysia

death

Today’s Headlines 23:00

Policies

Trump

discovered

Policy

Relations

US

visit

Malaysia

death

 

 


Reading and Writing – Now with python!

This week we started out with python. We were asked to make a UNIX command-like python script for text manipulation. In the previous assignment I struggled with line length and that kind of came up again. I played around with the .replace() method and escape characters. I thought that by inserting “\n” into a line it would then be considered a new line in the rest of the script but it did not seem to work that way. Then I decided to try to remove a specific list of words (in this case conjunctions) and I was able to get that working in a for-loop. Eventually I did something similar to the two columns of words I made last week.

This time I chose a Federal Reserve statement on interest rates and a CNN news release about that statement as the two source texts. I was able to go through each article and pick out all of the capitalized words over 3 characters in length and print them as a list. Then I added a randomized conjunction to each and created two columns with mostly proper nouns. The output reads fairly well and does seem to juxtapose the institutional and authoritative way the Fed speaks with the more speculative and popular language of the CNN article.

Here’s the python code:

Here’s the output:

Information and Federal so

Federal nor Reserve and

Open so Trump so

Market so America’s for

Committee so Wednesday or

December nor President or

Household nor Trump’s or

Measures nor Trump’s for

Inflation so U.S. so

Committee’s yet That but

Market-based but Faster so

Consistent so Right but

Committee or Americans but

Committee for Trump and

Near-term for Republicans nor

Committee or Fed’s but

Committee and Trump nor

Committee so Mexican but

This nor Trade nor

Committee so Mexico so

Committee or House for

However, and Republicans nor

Committee for Supporters but

Treasury nor Some yet

This but Trump’s yet

Committee’s so Congress. nor

Voting for Michael or

FOMC or Arone, for

Janet nor State for

Yellen, but Street so

Chair; but Global but

William for Advisors. nor

Dudley, and Trump’s yet

Vice or America’s yet

Chairman; yet Francisco and

Lael but President or

Brainard; but John and

Charles but Williams but

Evans; or Financial but

Stanley so Times and

Fischer; or Jan. for

Patrick yet March or

Harker; yet Investors or

Robert for March, but

Kaplan; yet Group. or

Neel for

Kashkari; for

Jerome for

Powell; and

Daniel or

Tarullo. nor
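The original script is not shown on this page, so here is a hedged sketch of the approach described in this post: keep capitalized words over 3 characters, attach a random conjunction to each, and print the two texts side by side. The function names are hypothetical and the two sample sentences are short placeholders, not quotes from the Fed statement or the CNN article.

```python
import random

CONJUNCTIONS = ["for", "and", "nor", "but", "or", "yet", "so"]

def capitalized_words(text, min_len=4):
    """Pick out capitalized words longer than three characters, in order."""
    return [w for w in text.split() if w[0].isupper() and len(w) >= min_len]

def two_columns(left_text, right_text, rng=random):
    """Pair up words from both texts, each followed by a random conjunction."""
    left = [w + " " + rng.choice(CONJUNCTIONS) for w in capitalized_words(left_text)]
    right = [w + " " + rng.choice(CONJUNCTIONS) for w in capitalized_words(right_text)]
    rows = []
    for i in range(max(len(left), len(right))):
        # When one column runs out, the row holds only the other column's cell.
        cells = [col[i] for col in (left, right) if i < len(col)]
        rows.append(" ".join(cells))
    return rows

# Placeholder sentences for illustration only.
fed = "the Federal Open Market Committee met in December"
cnn = "President Trump spoke on Wednesday"
for row in two_columns(fed, cnn):
    print(row)
```

Punctuation stays attached to the words, which matches entries like “Yellen,” and “Chair;” in the output above.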


RWET – Terminal Assignment

This week for Reading and Writing Electronic Text we were asked to do something creative with command line text manipulation. As source material I started working with two thesis papers I wrote in college. One was for a Transportation Geography class and dealt with the impact of mobile phones on transit preferences, and the other was for a class on Public Finance where I had written about GPS (GNSS) systems and the economic structure behind them. I’ve done a lot of work related to transportation and, to a certain extent, behavioral economics while at ITP, so I thought going back and working with these half remembered papers would be interesting.

First I converted the files from their “.docx” format to “.txt” to use with terminal. This worked but resulted in really long lines. Apparently in this instance the lines were broken where there had been paragraph breaks. I used the cut command to try and break up these lines into words or even sentences but had a hard time with that. Eventually I used fold to force the paragraphs into lines 80 columns long. Fold broke the paragraphs at odd points, so there were fractions of words at the beginning and end of each line. I kept playing around with the cut command and figured out that I could take the second word off of each line and come up with a list of full words that were pseudo-randomly selected from each paper. I thought comparing a random selection of words from each paper could be an interesting way to see how they differ. I guess I was curious whether the topic or my writing style would stand out under this comparison. I took both lists, sorted them and pasted the two columns next to each other in a separate file. Alone this wasn’t very interesting but when I ran different grep searches for words it would sometimes yield some interesting stuff.

Grep eco

become about

become about

becomes achieve

ecosystem as

for become

incentive economic

second not

socio-economic of

was second

 

Grep tech

technology. of

whose technology

willing technology

with technology

with technology

world technology

 

I thought this was an OK application of terminal but I felt like I could have gotten more out of this if I hadn’t spent so much time struggling with the weird line lengths. I would have preferred being able to pair words from both texts in a more complete and contextual way. However I was happy that I figured out how to compress most of this process into a couple lines in terminal. Mostly it was:

fold <CapstoneEcon.txt | cut -d ' ' -f 2 | sort >AlpListwordsEcon.txt

fold <geogcap.txt | cut -d ' ' -f 2 | sort >GeogFoldSort.txt

paste AlpListwordsEcon.txt GeogFoldSort.txt >CombinedEconGeog.txt
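For comparison, here is a rough Python sketch of what the fold, cut and sort steps in that pipeline do, under a few assumptions: text is folded by character count rather than display columns, sorting uses Python's default ordering rather than the locale's, and the function names are hypothetical. It also shows why the second field tends to be a whole word: the first field on each folded line is often the tail of a word that fold cut in two.

```python
def fold(line, width=80):
    """Break one long line into fixed-width pieces, like `fold` with its default width."""
    return [line[i:i + width] for i in range(0, len(line), width)]

def second_field(line, delimiter=" "):
    """Mimic `cut -d ' ' -f 2`: the second field, or the whole line if there is no delimiter."""
    fields = line.split(delimiter)
    return fields[1] if len(fields) > 1 else line

def second_words(paragraphs, width=80):
    """Roughly `fold | cut -d ' ' -f 2 | sort`, applied to a list of paragraph-long lines."""
    pieces = [piece for p in paragraphs for piece in fold(p, width)]
    return sorted(second_field(piece) for piece in pieces)

# A narrow width makes the effect visible on a short placeholder line.
print(second_words(["mobile phones change transit preferences"], width=16))
# prints: ['ferences', 'phones', 'transit']
```

The stray “ferences” comes from a folded piece with no space in it, which cut passes through unchanged. Short trailing pieces like that would explain any word fragments that still end up in the sorted lists.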

 

 

 

 
