Can anyone help a tf2 player with his CS 170 homework? I'm actually a junior business major, but took this as a substitute for another business class because I am slightly interested in this topic. Little did I know, the professor with the worst ratings at the school teaches it. When I try to Google a lot of things, the answers use more advanced things that we are not allowed to use in our code. When I go to tutoring I get the information I need, but it legitimately takes 6 hours out of my day to make any progress on one assignment alone. When I go talk to the professor, he gives me ambiguous and cryptic statements about how to complete the assignment. I just need someone I can refer to who can take a quick look at my code and tell me what in tarnation is happening and what I need to do to make it work. I know it seems kind of silly to come on here for help, but I am nearly 100% lost with this homework at times and at my wit's end. I just want it all to end. Stress levels too high for something that can be made a lot easier.
[quote=rcrogue]I just want it all to end.[/quote]
life is worth living, friend
The one I am currently looking at involves manipulation of text files in Python. Naturally, this professor wants us to count every occurrence of the standalone word "the" in the complete collection of Shakespeare's works, including "The", "the", and "THE". I have opened the file, read the file, and split the file, all in a function, but now I am struggling with how to count every occurrence of that word in the entire collection. How can I do this while being case insensitive? The worst part is it must be done within the scope of what we have covered so far, so I guess there will be some trial and error here if anybody comes up with a solution.
The python docs are great and should generally be your first place to look.
https://docs.python.org
Take the entire string and make it lowercase with str.lower(), then you can use a for loop to go through the entire string searching for the word “the”, adding 1 to a counter every time it finds it
or you could use str.count
Something like this
check below
Break the problem down into individual, manageable steps.
- convert the entire string to uppercase or lowercase
- get a word (defined by whitespace or punctuation)
- iterate through each word
- check if it matches "the" / "THE" depending on the case you made it
- increment a counter variable
Gotchas to look out for: words like "there" or "other" that contain "the". We avoid this problem by making sure that the characters around "the" are whitespace or punctuation. A few ways to do this would be putting every word in a list and comparing the entire string, making sure the characters surrounding "the" are not letters, or even matching " the " || "\"the " || "'the " and so on.
I don't know python (c#/java guy) but when you've figured out how you want to do it for each individual part, converting it into code shouldn't be that hard.
Some advice, it can be good to buddy up with people taking the course who know their shit. Even if you're just acquaintances, you can likely get some input from their solutions or advice on improving yours.
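For what it's worth, the "make sure the characters surrounding "the" are not letters" option from the gotchas above could look roughly like this in Python. This is only a sketch: the function name count_standalone and the sample sentence are made up for illustration, and str.find / str.isalpha may be outside what the class has covered so far.

```python
def count_standalone(text, target="the"):
    """Count occurrences of target that are not part of a longer word."""
    text = text.lower()              # makes the comparison case insensitive
    count = 0
    start = text.find(target)        # index of the first candidate, or -1
    while start != -1:
        end = start + len(target)
        # the neighbour on each side must not be a letter (or must not exist)
        before_ok = start == 0 or not text[start - 1].isalpha()
        after_ok = end == len(text) or not text[end].isalpha()
        if before_ok and after_ok:
            count += 1
        start = text.find(target, start + 1)   # look for the next candidate
    return count


sample = 'The other dog said, "the end," then left there.'
print(count_standalone(sample))
```

Here "other", "then" and "there" are skipped because a letter sits right next to the match, while "The" at the very start and the quoted "the" still count.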
pretty sure my code above works and is the most efficient way you can do it
Codingbat has a lot of good practice problems http://codingbat.com/python/String-2
Vulcan, your code will miss "the" at the beginning or end of the string.
[code]>>> "the quick brown fox jumps over the lazy dog".count(" the ")
1[/code]
[quote=epi]Vulcan, your code will miss "the" at the beginning or end of the string.
[code]>>> "the quick brown fox jumps over the lazy dog".count(" the ")
1[/code][/quote]
Edited thank you
didn’t think about a possible “the” inside of a word, I’m assuming the professor means “the” as a word correct?
[quote=Vulcan][quote=epi]Vulcan, your code will miss "the" at the beginning or end of the string.
[code]>>> "the quick brown fox jumps over the lazy dog".count(" the ")
1[/code][/quote]
Edited thank you[/quote]
That doesn't work any better.
[code]>>> "absinthe is an old sounding word".count("the ")
1[/code]
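To make the difference between the padded-substring variants easier to see, here is a small side-by-side comparison. It's a sketch that assumes plain str.count and str.split are fair game; the helper name word_count is just for illustration, and the two test strings are the ones from the posts above.

```python
def word_count(text):
    # split on whitespace, then compare whole words instead of substrings
    return text.lower().split().count("the")


tests = [
    "the quick brown fox jumps over the lazy dog",
    "absinthe is an old sounding word",
]

for text in tests:
    lowered = text.lower()
    print(repr(text))
    print("  count(' the '):", lowered.count(" the "))  # misses a leading "the"
    print("  count('the '): ", lowered.count("the "))   # also matches "absinthe "
    print("  whole words:   ", word_count(text))
```

Comparing whole words avoids both problems here, though it would still miss something like "the." with punctuation attached.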
[quote=Vulcan]I’m assuming the professor means “the” as a word correct?[/quote]
Yessir. I'll try it out. Thank you.
[code]
text = "shakespeare mumbo jumbo"
def vulcan_count(text):
    # count capitalized "The " first, e.g. at the very start of the text
    number = text.count("The ")
    # then lowercase everything and count " the " with a space on each side
    text = text.lower()
    return text.count(" the ") + number
[/code]
epi try this out?
On my way to tutoring to hopefully get this all sorted out. As well as this I have another assignment due tomorrow where we must find the harmonic mean from a list of numbers. Thanks for your help everyone, I will report back with results ;oo.
[code]>>> vulcan_count("Theodore reads his thesaurus at the theater")
4[/code]
I think splitting the string or using regex is the better part of valor here. Making a list of this kind of rule can get out of hand fast.
[quote=epi][code]>>> vulcan_count("Theodore reads his thesaurus at the theater")
4[/code]
I think splitting the string or using regex is the better part of valor here. Making a list of this kind of rule can get out of hand fast.[/quote]
Unless I'm mistaken, Vulcan's current code (when he took out " the" which should never have been there) should be fine for that, as long as capitalization is consistent and stuff like "THE QUICK BROWN FOX" never appears as a line in the text.
I would tell a later year cs student that they absolutely should rework that to better handle edge cases, but a first year non-cs student can get away with stuff like that.
[quote=JarateKing][quote=epi][code]>>> vulcan_count("Theodore reads his thesaurus at the theater")
4[/code]
I think splitting the string or using regex is the better part of valor here. Making a list of this kind of rule can get out of hand fast.[/quote]
Unless I'm mistaken, Vulcan's current code (when he took out " the" which should never have been there) should be fine for that, as long as capitalization is consistent and stuff like "THE QUICK BROWN FOX" never appears as a line in the text.[/quote]
I used “ the” because I was thinking of something like this “ djdjfkkdk the”
Vulcan you walk over the string 3 times and it isn't that easy to understand.
Split on space and any punctuation (probably not apostrophes but I doubt it'd matter), then take each output string, convert to lower case then compare with "the"
You also assume perfect grammar which is kind of a shaky thing to do
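A rough sketch of that split-then-compare idea, in case it helps. It assumes string.punctuation and str.translate are allowed; the names TABLE and count_split and the sample text are invented here. Apostrophes are left alone, as suggested above.

```python
from string import punctuation

# map every punctuation mark except the apostrophe to a space,
# so one translate() call handles all of them in a single pass
TABLE = str.maketrans({ch: " " for ch in punctuation if ch != "'"})


def count_split(text):
    count = 0
    for word in text.translate(TABLE).split():   # split on runs of whitespace
        if word.lower() == "the":                 # case-insensitive comparison
            count += 1
    return count


print(count_split("THE end--the very end. Prithee, not-of-the-newest!"))
```

Because hyphens become spaces too, the "the" inside a hyphenated word gets counted as its own word; whether that is wanted depends on what the assignment means by "standalone".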
It wouldn't count " the" at the end of the string. Granted, that's not valid English, but I wouldn't take that risk going off of what little I've read of Shakespeare. I've gotten wrong answers on problems like this in the past by making that kind of assumption.
[quote=gemm]You also assume perfect grammar which is kind of a shaky thing to do[/quote]
Yeah I don’t have any way to verify my code and I’m assuming a couple of things for the sake of simplicity. I don’t know what he has learned or what exactly his teacher expects him to do so I’m just guessing
Given a string shakespeare containing
[code]The quick brown fox jumps over the lazy dog thE.
THE other dog ate tHe thing .the but not there
[/code]
[code]
# Use Python's built-in string containing all punctuation
from string import punctuation
count = 0
# .lower() makes the whole string lowercase, so you don't have to worry about
# case anymore
# .split() separates the string into a list of things in between whitespace
for word in <censored>
    # .strip() removes all the characters in the argument (punctuation) from
    # the front and end of the string
    if <censored>:
        count += 1
print(count)
[/code]
Will print 6
Edit: I took out the complete solution
If you want to use functional programming, that ^ can be simplified to
[code]
from string import punctuation
print(sum([word.strip(punctuation) == 'the' \
for word in shakespeare.lower().split()]))
[/code]
If you're allowed to use regexes it's a lot easier: [code]import re
print(len(re.findall(r'(?i)\bthe\b',shakespeare)))[/code]
Lense, since this is a very beginner class I’m assuming he doesn’t know about libraries and Regex yet
most of the solutions here are way overcomplicated. from my very brief searching there aren't any shakespeare sentences that end with the or have the followed by a comma, so something like
[code]
>>> text = "The quick brown fox jumps over the lazy dog"
>>> text.lower().split().count("the")
2
[/code]
should be more than sufficient
also, none of the following is going to be helpful to the OP, but to get overly technical for fun:
[spoiler]
in the tempest act 2 scene 2, trinculo says the word not-of-the-newest
which will be counted by
[quote=Lense][code]import re
print(len(re.findall(r'(?i)\bthe\b',shakespeare)))[/code][/quote]
but not by
[quote=Lense]
[code]
from string import punctuation
print(sum([word.strip(punctuation) == 'the' \
for word in shakespeare.lower().split()]))
[/code]
[/quote]
i think the "the" in this word should get counted so the regex solution is probably more thorough.
[/spoiler]
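A quick way to check that claim, using only the two snippets quoted above applied to the single hyphenated word (the variable names are made up for this sketch):

```python
import re
from string import punctuation

word = "not-of-the-newest"

# regex version: \b treats the hyphens as word boundaries
regex_count = len(re.findall(r'(?i)\bthe\b', word))

# split/strip version: strip() only removes punctuation from the two ends,
# so the hyphenated word stays in one piece and never equals 'the'
strip_count = sum(w.strip(punctuation) == 'the' for w in word.lower().split())

print(regex_count, strip_count)
```

So the two approaches only disagree when "the" is glued to other words by punctuation in the middle.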
there are probably instances in which The is the first word in a quote so u should probably use strip() as well
Hey guys, thanks a ton for the help and ideas. I really appreciate it, and it is nice knowing that ideas can be generated here if I am ever stuck and can't find an answer elsewhere. After some painful discussions in tutoring, the assignment turned out not to be as terrible as I thought it was, which makes the code a lot simpler. Evidently punctuation is not a factor in the assignment. I should have probably noted that, and you guys would have figured out this answer yourselves a lot faster. So, sorry about that, and also, yes, we are not allowed to use Regex just yet.
I opened the file, read the file, lowercased all of it, split it all, set an accumulator for counting later, started a for loop for word in lowerWordList (what I named my split list), and within that put an if statement saying that if word == "the" then add one to the accumulator, and lastly returned the accumulator for the number I need. The second part of the assignment was to find any instance of the word "the" not only as a standalone, but also within other words, such as thee or prithee. So for that all that needed to be done was put [:3] after word in the if statement to index the first three letters of each word, which for the standalone "the" is the entire word.
BTW, there are 27,730 instances of the standalone word "the" in the Project Gutenberg .txt for the complete works of William Shakespeare, and 44,532 instances of the word "the" as a standalone and within other words of the complete works. Thanks again ;)) feelsgoodman
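For anyone landing here later, the approach described above would look something like the sketch below. The function names, the load_words helper and the sample sentence are made up for illustration, and the path passed to load_words would be whatever the assignment's file is called. One thing worth noting: word[:3] == "the" only matches words that start with "the" (like thee), so a word like prithee would need a check such as "the" in word instead.

```python
def load_words(path):
    # path is a placeholder for the assignment's text file
    with open(path) as f:
        return f.read().lower().split()


def count_exact(lowerWordList):
    count = 0                          # accumulator
    for word in lowerWordList:
        if word == "the":
            count += 1
    return count


def count_starts_with(lowerWordList):
    count = 0
    for word in lowerWordList:
        if word[:3] == "the":          # first three letters only
            count += 1
    return count


def count_contains(lowerWordList):
    count = 0
    for word in lowerWordList:
        if "the" in word:              # "the" anywhere inside the word
            count += 1
    return count


lowerWordList = "The thee said prithee to the other".lower().split()
print(count_exact(lowerWordList),
      count_starts_with(lowerWordList),
      count_contains(lowerWordList))
```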