Old 07-27-2008, 01:25 AM
seth vidal
 
Default Election Data

On Sat, 2008-07-26 at 20:05 -0400, Josh Boyer wrote:
> On Sat, 2008-07-26 at 12:11 -0600, Stephen John Smoogen wrote:
> > On Sat, Jul 26, 2008 at 11:55 AM, Josh Boyer <jwboyer@gmail.com> wrote:
> > > On Fri, 2008-07-25 at 21:08 -0400, seth vidal wrote:
> > >> On Fri, 2008-07-25 at 17:05 -0400, David Woodhouse wrote:
> > >> > I'd be very disappointed if we refused to release _anonymised_ vote data
> > >> > purely on the basis that we think there might be some nutter out there
> > >> > who wouldn't come out from under his table for a few days if we did so.
> > >>
> > >> I'd be disappointed if we were yet another data point of groups who do
> > >> not handle their users information w/care.
> > >>
> > >> It's such a cliche.
> > >
> > > Can you explain how it wouldn't be handled with care if it was
> > > anonymized?
> > >
> >
> > The issue is that the board is the steward of the data. How long does
> > the data get kept (what is Fedora's data retention policy?) and who is
> > allowed access to it is something the board should consider. Not just
> > for useful research, but fishing expeditions by some British Ministry
> > to see if David Woodhouse was voting or going to the Dr on such a date
> > and can be held for an additional 40 days because he forgot to mention
> > that when questioned. [Now David may think thats an ok situation, but
> > I would lose some sleep over it.. and I am just being selfish here.]
>
> Explain to me releasing ANONYMOUS voting data would implicate anyone.
>
> > People may also have some 'legal' expectation of privacy unless told
> > otherwise by banners and signed agreements (updated CLA's). This would
> > also affect whether the board could give the data out (or have to do
> > some such thing that any member who comes from Netherlands can't have
> > their data aggregated with sets given out unless they were told it was
> > going to be done).
>
> Again, how is privacy lost if the data is anonymous.
>

Sufficient anonymization would mean the data would end up being:

AAA:BBBBBBBB:CCC

Seriously, there's no good way to anonymize it enough w/o making it
useless.

More to the point, no one would believe it was sufficiently anonymized.

-sv



_______________________________________________
fedora-advisory-board mailing list
fedora-advisory-board@redhat.com
http://www.redhat.com/mailman/listinfo/fedora-advisory-board
 
Old 07-27-2008, 01:44 AM
"CLAY S"
 
Default Election Data

On Sat, Jul 26, 2008 at 17:25, seth vidal <skvidal@fedoraproject.org> wrote:

> Sufficient anonymization would mean the data would end up being:
>
> AAA:BBBBBBBB:CCC

Well, no. You would have:

ballot 001: {Jones : 10, Smith : 9, Adams : 0}
ballot 002: {Jones : 6, Smith : 0, Adams : 10}
etc.

Maybe a particular individual could identify his ballot if only one ballot was like the one he cast. But then only _he_ knows that's his ballot. There's no privacy disclosure issue here.

Now, there is an issue that he could sell his vote in this case. But I don't imagine that's a serious problem for you.
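The uniqueness point can be checked mechanically. The following is a minimal sketch, assuming ballots are plain candidate-to-score mappings with no voter identifiers attached; the ballot data and the helper names `ballot_key` and `unique_ballots` are illustrative and do not come from Fedora's voting application.

```python
from collections import Counter

# Hypothetical anonymized ballots: candidate -> score, no voter identifiers.
ballots = [
    {"Jones": 10, "Smith": 9, "Adams": 0},
    {"Jones": 6, "Smith": 0, "Adams": 10},
    {"Jones": 10, "Smith": 9, "Adams": 0},
]

def ballot_key(ballot):
    """Turn a ballot into a hashable key that ignores candidate order."""
    return tuple(sorted(ballot.items()))

def unique_ballots(ballots):
    """Return the ballots whose exact score pattern occurs only once."""
    counts = Counter(ballot_key(b) for b in ballots)
    return [b for b in ballots if counts[ballot_key(b)] == 1]

singletons = unique_ballots(ballots)
print(len(singletons), "of", len(ballots), "ballots have a unique score pattern")
```

A ballot in `singletons` is one its caster could recognize as his own. Whether that counts as a disclosure is the question being argued in this thread, which the code does not settle.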

> Seriously, there's no good way to anonymize it enough w/o making it useless.

Huh?

> More to the point, no one would believe it was sufficiently anonymized.

You don't have to "believe" the data is anonymized. You can prove it. If there's any information that identifies the voter, it's not anonymized.


 
Old 07-28-2008, 03:37 AM
Mike McGrath
 
Default Election Data

On Sat, 26 Jul 2008, seth vidal wrote:
>
> Sufficient anonymization would mean the data would end up being:
>
> AAA:BBBBBBBB:CCC
>
> Seriously, there's no good way to anonymize it enough w/o making it
> useless.
>

And the trick / trap of the thing is my definition of anonymous and
$RANDOM_PRIVACY_ADVOCATE's definition of anonymous might be completely
different and if we've already released the results without us notifying
them first then we've caused a problem.

Seth's got it right, let's not be one of those organizations who doesn't
handle their user info with care.

-Mike

 
Old 07-28-2008, 03:43 AM
"CLAY S"
 
Default Election Data

I don't see how there are different definitions of anonymous. Either the data set includes information about the voter, or it doesn't.

On Sun, Jul 27, 2008 at 19:37, Mike McGrath <mmcgrath@redhat.com> wrote:
> On Sat, 26 Jul 2008, seth vidal wrote:
> >
> > Sufficient anonymization would mean the data would end up being:
> >
> > AAA:BBBBBBBB:CCC
> >
> > Seriously, there's no good way to anonymize it enough w/o making it
> > useless.
> >
>
> And the trick / trap of the thing is my definition of anonymous and
> $RANDOM_PRIVACY_ADVOCATE's definition of anonymous might be completely
> different and if we've already released the results without us notifying
> them first then we've caused a problem.
>
> Seth's got it right, let's not be one of those organizations who doesn't
> handle their user info with care.
>
> -Mike



--
clay shentrup
phone: 206.801.0484

"Iraq? No, YOU rock!"

 
Old 07-28-2008, 07:13 AM
Nigel Jones
 
Default Election Data

CLAY S wrote:

> To whom it may concern:
>
> As a concerned citizen, deeply committed to improving the long term
> peace and prosperity of my species, I am requesting *anonymous* ballot
> results for your recent score voting elections - purely for scientific
> study. It is my sincere belief that such data, however anecdotal it
> might be, is the closest we can come to the sort of ballot data we
> would see if score voting had been used in political elections, since
> the Fedora elections are actually consequential (unlike, say, polls).
>
> Here are some links which underscore my sense that this issue is
> incredibly important for humanity's long-term best interest:
>
> http://rangevoting.org/LivesSaved.html
> http://rangevoting.org/RelImport.html
> http://rangevoting.org/WorldProblems.html
>
> This sort of data was made available for the HaikuOS icon selection:
> http://rangevoting.org/HaikuIcon.html

Like Seth, Matt, Mike & Co, I object to this for a variety of reasons,
and here they are (a comes from my perspective as a Fedora Contributor,
b comes from the perspective of someone who has dealt a bit with
statistics, and c comes from a personal & elections admin perspective):


a) Assuming you are going to do a similar publication to that of the
Haiku thing, then a very simple answer: if you're trying to prove a
point, then you're not doing a very good job: 'and a lot of
"abstentions." Problem is, we can't tell abstentions from 0's. (There
are no 0's, so that's my assessment.)' This is exactly how any data
you'd get from the Fedora Project would be (0 = abstain OR no immediate
preference). Where _I'VE_ voted zero in the ballots, I would have voted
zero even if I did have a no-preference option.

So my concern here is:
You can't compare apples with oranges, and any attempt to do so would
offend me as a user of Fedora. It also makes me disregard the rest of
the 'study' as phony because it gives no statistical burden of proof.


b) It's a bit steep to just come along and ask for _complete_ voting
records (anonymous or not). Why haven't you asked for a simple random
sample (SRS)? I would have thought of any such request much more
favourably (in light of point a).


c) There is the privacy of the votes. I make my vote in confidence
(unless I'm told beforehand) that my choices are between me and no one
else, and will ONLY be seen for purposes of tallying the votes. Why
should we treat our users' votes any differently than a responsible
government (i.e. not Zimbabwe) would? There is _no_ reason why, and any
attempt to change that I'd find disgraceful. Oh and please none of the
"Fedora is open" yada-yada because I know that, I've known that for
ages, but there is no way anyone can say everything is 100% transparent
because the Board, being effectively a governing body of Fedora, holds
meetings in private. They release the aggregate information (what
they can) but they can't release everything... Why? Because there are
just some things that _shouldn't_ be aired in the public arena; what can
be is kept for town-hall meetings.


My thoughts for the Board's consideration:
- Why should a vote in a Fedora election be any different to that of
nearly any governing body (heck, Public Companies don't release
non-aggregate vote data that I've seen)?

- Please consider future implications of such a move (to release data)
- Can we _please_ create some sort of policy to protect the voting data
(at the very least as a whole) in retrospect and for the future?


On another note, it should be made clear that even I have _never_ seen
individual vote information (anonymous or associated); there has been
no reason to. (Yes, I do check for invalid votes, but none have been
displayed and the queries have always been designed to avoid showing
valid data.)


> Debian also makes their election data public, though they use a worse
> and much more complex Condorcet method, called "Schulze".
>
> http://www.debian.org/vote/2003/leader2003_tally.txt

Yes, and people vote with the knowledge that such releases happen; I
honestly don't see anything wrong from their perspective. As for
immediately discounting it as 'worse', I don't see how you can claim
that; it's worked fairly well for them and I'm all for what works.


- Nigel
(n.b. I'm sure I've aired other concerns on IRC but I think I've
summarised most of them here)


 
Old 07-28-2008, 07:56 AM
"CLAY S"
 
Default Election Data

On Sun, Jul 27, 2008 at 23:13, Nigel Jones <dev@nigelj.com> wrote:

> a) Assuming you are going to do a similar publication to that of the Haiku thing, then a very simple answer, if you're trying to prove a point, then you're not doing a very good job 'and a lot of "abstentions." Problem is, we can't tell abstentions from 0's. (There are no 0's, so that's my assessment.)' this exactly how any data you'd get from Fedora Project would be (0 = abstain OR no immediate preference)

I'm surprised your interface doesn't have a clear distinction between a zero and an abstention. But I'm not sure what point you're trying to make. Not being able to tell 0's from abstentions is an "inadequacy" in the data. But there is still lots of valuable information in that data set. Even if we do not know for certain that a ballot with _no_ intermediate scores was intentionally strategically exaggerated, it is still plausible that it is, and still quite relevant to election researchers.
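To make the zero-versus-abstention distinction concrete, here is a minimal sketch of a tally that keeps the two apart. It assumes a missing or `None` score means "abstain" while an explicit `0` is a real score; that encoding, the function name `tally`, and the sample ballots are assumptions for illustration, not a description of how the Fedora voting application stores data.

```python
def tally(ballots, candidates):
    """Sum scores per candidate and count abstentions separately.

    A missing or None score is treated as an abstention (an assumption
    of this sketch); an explicit 0 is counted as a score of zero.
    """
    totals = {c: 0 for c in candidates}
    abstained = {c: 0 for c in candidates}
    for ballot in ballots:
        for c in candidates:
            score = ballot.get(c)
            if score is None:
                abstained[c] += 1
            else:
                totals[c] += score
    return totals, abstained

# Hypothetical ballots for illustration only.
ballots = [
    {"Jones": 10, "Smith": 0, "Adams": None},  # explicit 0 for Smith, abstains on Adams
    {"Jones": 6, "Adams": 10},                 # no entry for Smith: abstention
]
totals, abstained = tally(ballots, ["Jones", "Smith", "Adams"])
```

If the interface records only a `0` in both cases, the `abstained` counts cannot be recovered afterwards, which is the inadequacy being discussed.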


> , where _I'VE_ voted zero in the ballots, I would have voted zero even if I did have a no preference option.

Then the data set would have registered your zero, which is precisely what we'd want. Again I'm not sure what your point is.

> So my concern here is:
> You can't compare apples with oranges and any attempt to do so would offend me as a user of Fedora

Can you put that into a formal mathematical statement? Telling us that we can't compare "apples and oranges" and that it offends you is most uninformative.

> it also makes me disregard the rest of the 'study' as phony because it gives no statistical burden of proof

It's not phony. It's real data. And please tell me, in as formal language as possible, what a "statistical burden of proof" is. Could you be talking about a confidence interval perhaps?


> b) It's a bit steep to just come along and ask for _complete_ voting records (anonymous or not), why haven't you asked for a simple random sample (SRS) I would have thought of any such request much more favourably (in light of point a) than not.

I specifically raised that option in one of the earliest emails in this thread. I put it something like, "a random subset of the ballots".
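A simple random sample of ballots is straightforward to draw. The sketch below assumes the ballots are already stripped of voter identifiers and held in a list; `sample_ballots`, the `seed` parameter, and the sample data are hypothetical names chosen for illustration.

```python
import random

def sample_ballots(ballots, k, seed=None):
    """Draw k ballots uniformly at random, without replacement."""
    rng = random.Random(seed)  # a fixed seed makes the draw repeatable
    return rng.sample(ballots, k)

# Hypothetical anonymized ballots for illustration only.
ballots = [{"Jones": s, "Smith": 10 - s} for s in range(11)]
subset = sample_ballots(ballots, 4, seed=2008)
```

A fixed seed is handy for reproducing an analysis. For a real release an unseeded source such as `random.SystemRandom` would probably be the safer choice, so that nobody can reconstruct which ballots were left out.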

> c) There is the privacy of the votes, I make my vote in confidence (unless I'm told beforehand) that my choices are between me and no one else, and will ONLY be seen for purposes of tallying the votes. Why should we treat our users votes any differently than a responsible government (i.e. not Zimbabwe). There is _no_ reason why, and any attempt to change that I'd find disgraceful.

With all due respect, this seems irrational to me if the data is anonymous, especially if a random sample is provided. If even a voter _himself_ cannot tell that a particular ballot in the data set is his, clearly his privacy has not been violated.

Although as I already said, I'd be more than happy to wait until future elections have a notice that the anonymized ballot data will be used for transparency/research.

> Oh and please none of the "Fedora is open" yada-yada because I know that, I've known that for ages, but there is no way anyone can say everything is 100% transparent because the Board, being effectively a governing body of Fedora, holds meetings in private, they release the aggregate information out (what they can) but they can't release everything... Why? Because there are just some things that _shouldn't_ be aired in the public arena, what can be is kept for town-hall meetings.

Agreed. Can you think of any sound reasons the ballot data shouldn't be aired in public?


> My thoughts for the Board's consideration:
>
> - Why should a vote in a Fedora election be any different to that of nearly any governing body (heck, Public Companies don't release non-aggregate vote data that I've seen)?

That's a logical fallacy - an appeal to tradition/popularity. If the evidence says that disclosing the anonymized data is harmless and even has some benefits, then it is _those companies_ who are doing it wrong, and so citing them as an example doesn't bolster your case.

> - Please consider future implications of such a move (to release data)

Are there some negative implications you can think of?

> - Can we _please_ create some sort of policy to protect the voting data (at the very least as a whole) in retrospect and for the future?

Why not just create a policy of telling the voters beforehand that the data will be made public but anonymous?

> On another note, it should be made clear, that even I have _never_ seen individual vote information (anonymously or associated), there has been no reason to (yes I do check for invalid votes, but none have been displayed and the queries have always been designed to avoid showing valid data).

Again I fail to see what point you're making.

> Yes, and people vote with the knowledge that such releases happen, I honestly don't see anything wrong from their perspective. As for immediately discounting as 'worse', I don't see how you can claim that, it's worked fairly well for them and I'm all for what works.

You have no way of knowing that "it's worked fairly well for them", since you can't read the voters' minds to calculate the Bayesian regret they experienced through it. That can only be gotten via computer simulation, where you _can_ read the voters' minds.


http://rangevoting.org/BayRegDum.html

Some sample B.R. figures for Condorcet and scoring:

Magically elect optimum winner                      100.00%
Range (honest voters)                                96.71%
Condorcet-LR (honest voters)                         85.19%
Range & Approval (strategic exaggerating voters)     78.99%
Condorcet-LR (strategic exaggerating voters)         42.56%
Elect random winner                                   0.00%

-clay

 
Old 07-28-2008, 08:19 AM
"CLAY S"
 
Default Election Data

Let me also say this.

We can argue till the cows come home, but I think it is probably already clear which people have the authority to make this decision, and it is probably already clear to them what the best decision is. So continuing to debate it is academic. I'm simply trying to emphasize that this data holds great value to humanity, and so releasing it in the future -- after dutifully informing voters about it beforehand -- seems to me to be a very good thing.


At http://rangevoting.org/LivesSaved.html, Warren Smith begins:
==
During 2000-2050, the world will face several crises. These include: the end of cheap oil, various "fossil water" resources running out, global fisheries species collapse, USA federal bankruptcy, overpopulation, nuclear and bioweapons proliferation, and climate change. These could easily bring about the "end of modern civilization." Far as I can see, world population and consumption levels are already well beyond what can be supported with renewable resources, therefore a population decline is inevitable.


In view of that, the world needs to make good decisions. But the "decision making algorithm for the world" is (to a close approximation, with the USA the only "superpower") the same as "the USA's horrible voting system."


That isn't good enough. Range voting is a far better decision-making algorithm.
==

That is the context in which I view this entire issue. Doubling the effect of democracy over non-democracy is an opportunity which far exceeds the value of competing reforms. But the vast majority of humans do not know how big of a problem this is, and so there are enormous obstacles to the implementation of score-based voting. In order to ultimately have success at implementing score voting in political elections, we need to have as much information as possible about its behavior in real "contentious" elections. Fedora elections may not inspire the same kind of passionate voting as the race between Clinton and Obama did, but they are significant, and attended by some of the smartest people in society. The scarcity of data from scoring elections is particularly significant considering that the Center for Range Voting only began in 2005 - and before that point, at which Smith's Bayesian regret data became actively publicized, the conventional wisdom was that scoring would not work as a voting method. So this makes the Fedora data even more valuable.


Bottom line is this. I do not want to disrespect the privacy of your voters. But I do want to use this limited lifespan to do everything possible to save my species from destroying itself in the not-too-distant future. Considering what's at stake, I believe it is not too unreasonable for the Fedora community to change policy to make future election data available to the public - or at least to interested activists who want to research it. I just don't think there's any real harm in it, as long as you've respected the voters by letting them know.


Regards,
Clay "Savin' the Planet" Shentrup

 
Old 07-28-2008, 06:06 PM
Toshio Kuratomi
 
Default Election Data

CLAY S wrote:
> Considering what's at stake, I believe it is
> not too unreasonable for the Fedora community to change policy to make
> future election data available to the public

Really, I'd like this to be put to a vote by Fedora Contributors.
Because the real question for releasing anonymized election data for the
future is: will people continue to vote if they know the data will be
released anonymously?


I see no reason they wouldn't. But if 25% of our active voters say they
would not vote if that were the case, then that's something we should
listen to.


*Note: This is a slightly difficult thing to deal with in terms of
making the decision. A simple majority of voters shouldn't be able to
open the election data as what we want to avoid doing is alienating
voters. Making the results non-binding isn't a very good idea either.
Perhaps setting a percentage beforehand. If the percentage of people
who say they'll stop voting if the election data is released anonymously
is greater than a certain amount, then we won't do that.


Also -- we should be sure we are clear about what we are talking about
releasing: anonymized individual ballots vs aggregate statistics (which
we already do release for some elections), and how we're going to do it
(what tables exist in the db, and how we can select information from the
tables so as not to associate the user with the data).


-Toshio

 
Old 07-28-2008, 06:31 PM
Chris Tyler
 
Default Election Data

On Mon, 2008-07-28 at 10:06 -0700, Toshio Kuratomi wrote:
> CLAY S wrote:
> > Considering what's at stake, I believe it is
> > not too unreasonable for the Fedora community to change policy to make
> > future election data available to the public
>
> Really, I'd like this to be put to a vote by Fedora Contributors.
> Because the real question for releasing anonymized election data for the
> future is: will people continue to vote if they know the data will be
> released anonymously.

Perhaps a good solution is to allow participants to opt in on a per-vote
basis: a checkbox that says "I consent to having my anonymized vote
published" could appear right on the vote form. (Vote data should also
be randomly ordered so that time-correlation is not possible).
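A rough sketch of what such an opt-in export could look like follows. It assumes each stored record carries a voter identifier, a consent flag taken from the checkbox, and the scores; the field names `voter_id`, `consent` and `scores`, the function `export_opted_in`, and the sample records are all hypothetical and do not reflect the actual Fedora elections database.

```python
import random

def export_opted_in(records):
    """Return publishable ballots: consenting voters only, identifiers dropped, order shuffled."""
    # Copy only the scores, so voter_id and any other field never reach the output.
    published = [dict(r["scores"]) for r in records if r.get("consent")]
    # Shuffle with the OS entropy source so the output order cannot be
    # matched against the order in which votes were cast.
    random.SystemRandom().shuffle(published)
    return published

# Hypothetical stored records for illustration only.
records = [
    {"voter_id": "alice", "consent": True, "scores": {"Jones": 10, "Smith": 2}},
    {"voter_id": "bob", "consent": False, "scores": {"Jones": 0, "Smith": 9}},
    {"voter_id": "carol", "consent": True, "scores": {"Jones": 4, "Smith": 4}},
]
public = export_opted_in(records)
```

Copying only the `scores` mapping, rather than deleting fields from the stored record, means a column added to the table later cannot leak into the export by accident.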

> I see no reason they wouldn't. But if 25% of our active voters say they
> would not vote if that were the case, then that's something we should
> listen to.

Definitely; the last thing we want to do is reduce our already-low
voter turnout.

-Chris

 
Old 07-28-2008, 07:30 PM
Jesse Keating
 
Default Election Data

On Mon, 2008-07-28 at 13:31 -0400, Chris Tyler wrote:
> Perhaps a good solution is to allow participants to opt in on a per-vote
> basis: a checkbox that says "I consent to having my anonymized vote
> published" could appear right on the vote form. (Vote data should also
> be randomly ordered so that time-correlation is not possible).

This I like best, because it's a per voter, per instance thing, rather
than a set of voters today deciding things for sets of voters in the
future.

--
Jesse Keating
Fedora -- Freedom² is a feature!
 
