08-03-2008, 04:52 AM
"Mark A. Taff"

Make a word list from a text

On Saturday 02 August 2008 15:06:22 John DeCarlo wrote:
> On Sat, Aug 2, 2008 at 2:17 AM, Mark A. Taff <marktaff@comcast.net> wrote:
> > This version will remove most punctuation, with the notable exception of
> > apostrophes. You start running into context problems: Is that apostrophe
> > marking a plural (mark's computer)
>
> Oops, you mean "possessive", not plural.

Yes, and plural possessive, too. ;-) I was neither clear nor correct, but it
seems everyone understood, thankfully.

--Mark
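
Mark's actual script is not shown in this message, so the following is only a sketch of the general idea of stripping punctuation while keeping apostrophes. The function name `words_keep_apostrophes` is made up for illustration, and the approach (complementing a character set with `tr -c`) is an assumption rather than what was posted.

```shell
#!/bin/sh
# words_keep_apostrophes: hypothetical helper, not the script from the thread.
# Reads text on standard input and prints a sorted list of distinct tokens.
words_keep_apostrophes() {
    # -c complements the set, so every character that is NOT a letter or an
    # apostrophe becomes a newline; -s squeezes runs of newlines into one.
    tr -cs "[:alpha:]'" '\n' | sort -u
}
```

Called as `words_keep_apostrophes < myfile`, this keeps "mark's" and "dogs'" intact. Quote marks typed as apostrophes, as in 'word', stay attached to the word as well, which is exactly the context problem discussed above.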

--
kubuntu-users mailing list
kubuntu-users@lists.ubuntu.com
Modify settings or unsubscribe at: https://lists.ubuntu.com/mailman/listinfo/kubuntu-users
 
08-03-2008, 07:42 AM
Wulfy

Make a word list from a text

Donn wrote:
> On Saturday, 02 August 2008 05:52:46 Wulfy wrote:
>
>> I want to take a text file and extract all the words and sort them into
>> a unique list.
>>
> I gave it a go and this is the best I can do:
> cat myfile | sed "s/'//g" | tr -s '[:space:][:punct:]' "\n" | sort | uniq -c
>
> The sed bit is to remove single quotes so words like "didn't" don't
> become "didn" and "t". It then uses tr to replace spaces or punctuation with
> newlines and then out to sort and uniq.
>
> I find text parsing very hard to do. There seem to be corner-cases everywhere.
> What is a word really? How do you define its edges? Ah well, HTH.
> d
>
>
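
For reference, the steps Donn describes can be wrapped in a small shell function. This is a minimal sketch, assuming the second `tr` class is `[:punct:]` and a `tr` that pads the shorter second set (GNU and BSD `tr` both do); the name `wordlist` is hypothetical.

```shell
#!/bin/sh
# wordlist: hypothetical wrapper around the pipeline quoted above.
# Reads text on standard input and prints each distinct word with its count.
#   sed      deletes apostrophes so "didn't" stays one token ("didnt")
#   tr -s    turns whitespace and punctuation into newlines, squeezing runs
#   sort     puts identical words next to each other
#   uniq -c  collapses each group of identical lines and prefixes its count
wordlist() {
    sed "s/'//g" | tr -s '[:space:][:punct:]' '\n' | sort | uniq -c
}
```

Usage would be `wordlist < myfile`. The counting is case-sensitive, so "The" and "the" are listed separately. If only the unique list is wanted, `sort -u` in place of `sort | uniq -c` drops the counts.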
Wow! Yet another way to do it! I said there were a bazillion
ways...... :@D

Thanks, Donn! :@)

--
Blessings

Wulfmann

Wulf Credo:
Respect the elders. Teach the young. Co-operate with the pack.
Play when you can. Hunt when you must. Rest in between.
Share your affections. Voice your opinion. Leave your Mark.
Copyright July 17, 1988 by Del Goetz


 
08-04-2008, 12:45 PM
Alexander Smirnov

Make a word list from a text

Donn wrote:
> On Saturday, 02 August 2008 05:52:46 Wulfy wrote:
>
>> I want to take a text file and extract all the words and sort them into
>> a unique list.
>>
> I gave it a go and this is the best I can do:
> cat myfile | sed "s/'//g" | tr -s '[:space:][:punct:]' "\n" | sort | uniq -c
>
>
>
Good! But cat is not necessary, because sed can read the input file itself:

sed [OPTION]... {script-only-if-no-other-script} [input-file]...
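
To make the point concrete, here is a rough side-by-side sketch of Donn's pipeline with and without `cat`. Both function names are invented for the comparison, and the `tr` classes are assumed to be `[:space:]` and `[:punct:]`.

```shell
#!/bin/sh
# Two hypothetical helpers that differ only in how sed gets its input.
# Each takes a file name as its first argument.

# Original form: cat feeds the file to sed through a pipe.
words_with_cat() {
    cat "$1" | sed "s/'//g" | tr -s '[:space:][:punct:]' '\n' | sort | uniq -c
}

# Shorter form: sed opens the file itself, so cat is not needed.
words_without_cat() {
    sed "s/'//g" "$1" | tr -s '[:space:][:punct:]' '\n' | sort | uniq -c
}
```

Both `words_with_cat myfile` and `words_without_cat myfile` should print the same list; the second simply starts one process fewer.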


 
08-04-2008, 12:59 PM
Donn

Make a word list from a text

On Monday, 04 August 2008 14:45:14 Alexander Smirnov wrote:
> sed [OPTION]... {script-only-if-no-other-script} [input-file]...
Cool. sed and I are verrrry uneasy pals

d


--
Don't look now, but there's one too many in this room and I think it's you. --
Groucho Marx

Fonty Python and other dev news at:
http://otherwiseingle.blogspot.com/

 
