06-08-2011, 06:01 PM
Vikraman

Gentoo package statistics -- GSoC 2011

On Wed, Jun 08, 2011 at 05:19:33PM +0200, "Paweł Hajdan, Jr." wrote:
> On 6/8/11 4:36 PM, Vikraman wrote:
> > I'm working on the `Package statistics` project this year. Till now, I
> > have managed to write a client and server[0] to collect the following
> > information from hosts:
>
> Excellent, good luck with the idea! I think that better information
> about how Gentoo is actually used will greatly help improve it.
>

Well, that information cannot be collected automatically, can it?

> > Is there a need to collect files installed by a package? Doesn't PFL[1]
> > already provide that?
>
> Well, PFL is not an official Gentoo project. It might be useful, but I
> wouldn't say it's a priority.
>
> > Please provide some feedback on what other data should be collected, etc.
>
> In my opinion it's *not* about collecting as much data as possible. I
> think it's most important to get the core functionality working really
> well, and convincing as large a percentage of users as possible to enable
> reporting the statistics (to make the results - hopefully - accurately
> represent the user base). Please note that in some cases it may mean
> collecting _less_ data, or thinking more about the privacy of the users.
>
> For me, as a developer, even a list of packages sorted by popularity
> (aka Debian/Ubuntu popcon) would be very useful.
>
> Ah, and maybe files in /etc/portage: package.keywords and so on. It
> could be useful to see what people are masking/unmasking, that may be an
> indication of stale stabilizations or brokenness hitting the tree.
> Anyway, I'd call it an enhancement.
>
> > Also, I'm starting work on the webUI, and would like some
> > recommendations for stats pages, such as:
> >
> > * Packages installed sorted by users
>
> Cool!
>
> > * Top arches, keywords, profiles
>
> And percentage of ~arch vs arch users?
>
> > * Most enabled, disabled useflags per package/globally
>
> Also great, especially the per-package variant. It'd also be useful to
> have per-profile data, to better tune the profile defaults.
>
> > [0]
> > http://git.overlays.gentoo.org/gitweb/?p=proj/gentoostats.git;a=commit;h=1b9697a090515d2a373e83b1094d6e08ec405c02
>
> I took a quick look at the code. Some random comments:
>
> - it uses portage Python API a lot. But it's not stable, or at least not
> guaranteed to be stable. Have you considered using helpers like portageq
> (or eventually enhancing those helpers)?
>
> - make the licensing super-clear (a LICENSE file, possibly some header
> in every source file, and so on)
>
> - how about submitting the data over HTTPS and not HTTP to better help
> privacy?

Fair points, thanks!
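
The portageq suggestion can be sketched roughly as follows. This is only an illustration and not code from the gentoostats repository: it shells out to `portageq envvar <NAME>` instead of importing the portage Python API. The function names (`portage_envvar`, `collect_envvars`) and the injectable `run` parameter are assumptions made for the example.

```python
import subprocess
from typing import Callable, Dict, Iterable, Sequence


def _run(cmd: Sequence[str]) -> str:
    """Run a command and return its standard output as text."""
    result = subprocess.run(cmd, check=True, capture_output=True, text=True)
    return result.stdout


def portage_envvar(name: str, run: Callable[[Sequence[str]], str] = _run) -> str:
    """Return one Portage variable by calling `portageq envvar NAME`."""
    return run(["portageq", "envvar", name]).strip()


def collect_envvars(names: Iterable[str],
                    run: Callable[[Sequence[str]], str] = _run) -> Dict[str, str]:
    """Collect several variables (hypothetical helper for a stats client)."""
    return {name: portage_envvar(name, run) for name in names}
```

On a Gentoo host, `collect_envvars(["ARCH", "CHOST", "CFLAGS"])` would spawn one `portageq` process per variable. Passing a different `run` callable keeps the helper testable on machines without Portage.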

>
> - don't leave exception handling as a TODO; it should be a part of your
> design, not an afterthought
>
> - instead of or in addition to the setup.txt file, how about just
> writing the real setup.py file for distutils?
>

Yes, these are part of my sub-goals for next week.
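
As a rough illustration of treating error handling as part of the design, and of the HTTPS point raised earlier, a client-side submit function might look like the sketch below. It is not taken from the gentoostats code. `SubmitError`, `submit_report`, the JSON body format, and the injectable `post` callable are all assumptions for the example.

```python
import json
import urllib.error
import urllib.request
from urllib.parse import urlsplit


class SubmitError(Exception):
    """Raised when a report could not be delivered."""


def _post(url: str, body: bytes) -> int:
    """POST the body and return the HTTP status code."""
    req = urllib.request.Request(
        url, data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req, timeout=30) as resp:
        return resp.status


def submit_report(url: str, payload: dict, post=_post) -> None:
    """Send one report, failing loudly and predictably on every error path."""
    if urlsplit(url).scheme != "https":
        raise SubmitError("refusing to send statistics over a non-HTTPS URL")
    body = json.dumps(payload).encode("utf-8")
    try:
        status = post(url, body)
    except OSError as exc:  # URLError and HTTPError are OSError subclasses
        raise SubmitError(f"could not deliver report: {exc}") from exc
    if not 200 <= status < 300:
        raise SubmitError(f"server answered with HTTP status {status}")
```

The caller then has a single exception type to catch and can decide whether to retry, log, or give up.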

--
Vikraman
 
06-08-2011, 06:07 PM
Vikraman

On Wed, Jun 08, 2011 at 09:35:26PM +0400, Николай Антонов wrote:
> On 08.06.2011 18:36, Vikraman wrote:
> > Hi everyone,
> >
> > I'm working on the `Package statistics` project this year. Till now, I
> > have managed to write a client and server[0] to collect the following
> > information from hosts:
> >
> > * Uname, portage profile, timestamp of portage tree
> > * ARCH, CHOST, CFLAGS, CXXFLAGS, FFLAGS, LDFLAGS, MAKEOPTS
> > * ACCEPT_KEYWORDS, FEATURES, USE, LANG, SYNC, GENTOO_MIRRORS
> > * Repository, Keyword, Useflags (plus,minus,unset), Counter, Size,
> > and Build time for each installed package
> >
>
> Maybe collect hardware info and kernel configs too?
> For example cpuinfo, lspci and lsusb(?).

That's not part of package statistics. There's the smolt project for
hardware statistics.

>
> I think that 1-3 months after installing Gentoo, the user could (should)
> receive a news item about participating in the `Package statistics` project.
> This news item could contain short instructions on how to install and
> configure the tool. It could also be mentioned in other Gentoo projects
> (for example, in a short wiki page).
>
> And where can I find ebuilds for the `Package statistics` project?

The server hasn't been deployed yet, and ebuilds will be available soon!
>
> Sorry for my English... and good luck!
>

--
Vikraman
 
06-08-2011, 06:28 PM
Donnie Berkholz

On 17:19 Wed 08 Jun, "Paweł Hajdan, Jr." wrote:
> On 6/8/11 4:36 PM, Vikraman wrote:
> > I'm working on the `Package statistics` project this year. Till now, I
> > have managed to write a client and server[0] to collect the following
> > information from hosts:
>
> Excellent, good luck with the idea! I think that better information
> about how Gentoo is actually used will greatly help improve it.
>
> > Is there a need to collect files installed by a package? Doesn't PFL[1]
> > already provide that?
>
> Well, PFL is not an official Gentoo project. It might be useful, but I
> wouldn't say it's a priority.

I would love to see it happen, but it's more important to roll out a
minimal working solution now and add on later.

By combining installed files with USE flag settings, this project could
actually attempt to factor out which USE flags result in which files in
an automatic fashion. That would address one of the biggest objections
many people have had to such a package-to-file search engine.
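
One way to picture the "factor out which USE flags result in which files" idea is the set arithmetic below. It is a sketch under strong assumptions, not a design anyone in this thread has committed to: each report is taken to be a pair of (enabled USE flags, installed file paths) for the same package version, and `files_tied_to_flag` is a hypothetical name.

```python
from typing import Iterable, Set, Tuple

Report = Tuple[Set[str], Set[str]]  # (enabled USE flags, installed files)


def files_tied_to_flag(reports: Iterable[Report], flag: str) -> Set[str]:
    """Files seen in every report with `flag` enabled and in none without it."""
    reports = list(reports)
    with_flag = [files for flags, files in reports if flag in flags]
    without_flag = [files for flags, files in reports if flag not in flags]
    if not with_flag or not without_flag:
        return set()  # not enough evidence either way
    always_with = set.intersection(*map(set, with_flag))
    ever_without = set.union(*map(set, without_flag))
    return always_with - ever_without


reports = [
    ({"ssl", "doc"}, {"/usr/bin/foo", "/usr/lib/libfoo_ssl.so",
                      "/usr/share/doc/foo/README"}),
    ({"ssl"}, {"/usr/bin/foo", "/usr/lib/libfoo_ssl.so"}),
    (set(), {"/usr/bin/foo"}),
]
print(files_tied_to_flag(reports, "ssl"))  # {'/usr/lib/libfoo_ssl.so'}
```

Files that depend on a combination of flags would need more than this per-flag comparison, so the result is best read as a heuristic.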

It would also be pretty useful for some other GSoC projects, like the
ebuild generator and the auto dependency scanner.

--
Thanks,
Donnie

Donnie Berkholz
Sr. Developer, Gentoo Linux
Blog: http://dberkholz.com
 
06-08-2011, 06:55 PM
Hans de Graaff

On Wed, 2011-06-08 at 23:31 +0530, Vikraman wrote:

> > Excellent, good luck with the idea! I think that better information
> > about how Gentoo is actually used will greatly help improve it.
> >
>
> Well, that information cannot be collected automatically, can it?

You could pop up a window at random times and ask the user. So it can be
done. Whether it's a good idea …

Hans
 
06-08-2011, 07:54 PM
"Francisco Blas Izquierdo Riera (klondike)"

On 08/06/11 20:07, Vikraman wrote:
> On Wed, Jun 08, 2011 at 09:35:26PM +0400, Николай Антонов wrote:
>> On 08.06.2011 18:36, Vikraman wrote:
>>> Hi everyone,
>>>
>>> I'm working on the `Package statistics` project this year. Till now, I
>>> have managed to write a client and server[0] to collect the following
>>> information from hosts:
>>>
>>> * Uname, portage profile, timestamp of portage tree
>>> * ARCH, CHOST, CFLAGS, CXXFLAGS, FFLAGS, LDFLAGS, MAKEOPTS
>>> * ACCEPT_KEYWORDS, FEATURES, USE, LANG, SYNC, GENTOO_MIRRORS
>>> * Repository, Keyword, Useflags (plus,minus,unset), Counter, Size,
>>> and Build time for each installed package
>>>
>> Maybe collect hardware info and kernel configs too?
>> For example cpuinfo, lspci and lsusb(?).
> That's not part of package statistics. There's the smolt project for
> hardware statistics.
Well, there is another reason why you don't want to log that:
hardened users. Keeping the kernel .config private helps make the
system more resilient to some attacks, so many hardened users are
adamant about not having their .config files published.
 
06-08-2011, 08:26 PM
ross smith

>>> Maybe collect hardware info and kernel configs too?
>>> For example cpuinfo, lspci and lsusb(?).
>> That's not part of package statistics. There's the smolt project for
>> hardware statistics.
> Well, there is another reason why you don't want to log that:
> hardened users. Keeping the kernel .config private helps make the
> system more resilient to some attacks, so many hardened users are
> adamant about not having their .config files published.

I would really like to see a nice way to set what information I want
sent. Perhaps a config file in /etc? Also, an option to see what
is being sent would be great.
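
A config-driven filter with a preview option could be sketched like this. Nothing here reflects an existing gentoostats config format: the `[collect]` section, the `yes`/`no` values, and the function names are assumptions. The sketch also assumes an opt-in default, so fields not listed in the config are not sent.

```python
import configparser
import json

# Hypothetical contents of a config file somewhere under /etc.
EXAMPLE_CONFIG = """\
[collect]
USE = yes
CFLAGS = no
"""


def load_allowed(config_text, known_fields):
    """Return the set of fields the user has explicitly enabled."""
    parser = configparser.ConfigParser()
    parser.optionxform = str  # keep option names case-sensitive (USE, CFLAGS)
    parser.read_string(config_text)
    if not parser.has_section("collect"):
        return set()
    section = parser["collect"]
    return {f for f in known_fields if section.getboolean(f, fallback=False)}


def filter_payload(payload, allowed):
    """Drop every field the user has not enabled."""
    return {key: value for key, value in payload.items() if key in allowed}


def preview(payload):
    """Render exactly what would be sent, for a 'show me first' option."""
    return json.dumps(payload, indent=2, sort_keys=True)
```

A client could print `preview(filter_payload(payload, allowed))` before, or instead of, submitting anything.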

I look forward to contributing my machine's info.

-Ross
 
06-10-2011, 11:10 PM
Sebastian Pipping

On 06/08/2011 04:36 PM, Vikraman wrote:
> * Repository, Keyword, Useflags (plus,minus,unset), Counter, Size,
> and Build time for each installed package

How many operations do you expect at the SQL level for a submission with
1000 packages? Will that be around 1000 inserts?

Best,
Sebastian
 
06-11-2011, 02:06 AM
Vikraman

On Sat, Jun 11, 2011 at 01:10:36AM +0200, Sebastian Pipping wrote:
> On 06/08/2011 04:36 PM, Vikraman wrote:
> > * Repository, Keyword, Useflags (plus,minus,unset), Counter, Size,
> > and Build time for each installed package
>
> How many operations do you expect at the SQL level for a submission with
> 1000 packages? Will that be around 1000 inserts?
>

One insert for each package entry, and one insert for every useflag.
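
To make that count concrete, here is a small sketch using an in-memory SQLite database. The actual server schema is not shown in this thread, so the table and column names below are invented for the example. Only the "one insert per package, one insert per useflag" pattern comes from the answer above.

```python
import sqlite3

# Hypothetical schema, for illustration only.
SCHEMA = """
CREATE TABLE installed_packages (
    id   INTEGER PRIMARY KEY,
    host TEXT NOT NULL,
    name TEXT NOT NULL
);
CREATE TABLE installed_useflags (
    package_id INTEGER NOT NULL REFERENCES installed_packages(id),
    flag       TEXT NOT NULL
);
"""


def store_submission(conn, host, packages):
    """Store one host's packages; return the number of INSERTs issued.

    `packages` maps a package name to the list of useflags recorded for it.
    """
    inserts = 0
    with conn:  # one transaction for the whole submission
        for name, flags in packages.items():
            cur = conn.execute(
                "INSERT INTO installed_packages (host, name) VALUES (?, ?)",
                (host, name),
            )
            inserts += 1
            for flag in flags:
                conn.execute(
                    "INSERT INTO installed_useflags (package_id, flag) "
                    "VALUES (?, ?)",
                    (cur.lastrowid, flag),
                )
                inserts += 1
    return inserts
```

With this pattern, a host reporting 1000 packages with, say, five recorded useflags each (a made-up average) would cost 1000 + 5000 = 6000 inserts, so the total depends mostly on the number of useflags rather than the number of packages.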

> Best,
> Sebastian

--
Vikraman
 
