Linux Archive > Redhat > Device-mapper Development

 
 
 
08-19-2012, 01:16 PM
Mathieu Desnoyers

hashtable: introduce a small and naive hashtable

* Sasha Levin (levinsasha928@gmail.com) wrote:
> This hashtable implementation is using hlist buckets to provide a simple
> hashtable to prevent it from getting reimplemented all over the kernel.
>
> Signed-off-by: Sasha Levin <levinsasha928@gmail.com>
> ---
> include/linux/hashtable.h | 284 +++++++++++++++++++++++++++++++++++++++++++++
[...]

Hi Sasha,

There are still a few API naming nits that I'd like to discuss:

> +
> +/**
> + * hash_for_each_size - iterate over a hashtable
> + * @name: hashtable to iterate
> + * @bits: bit count of hashing function of the hashtable
> + * @bkt: integer to use as bucket loop cursor
> + * @node: the &struct list_head to use as a loop cursor for each bucket
> + * @obj: the type * to use as a loop cursor for each bucket
> + * @member: the name of the hlist_node within the struct
> + */
> +#define hash_for_each_size(name, bits, bkt, node, obj, member)

What is the meaning of "for each size"?

By looking at the implementation, I see that it takes an extra "bits"
argument to specify the key width.

But in the other patches of this patchset, I cannot find a single user
of the "*_size" API. If users are not typically expected to specify
this parameter by hand (thanks to the use of HASH_BITS(name) in the
for_each variants that do not take the bits parameter), I would
recommend exposing only hash_for_each() and similar defines, but not
the *_size variants.

So I recommend merging hash_for_each_size into hash_for_each (and
doing similarly for other *_size variants). On the plus side, it will
cut down the number of for_each macros from 12 down to 6, which is more
reasonable.
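The suggestion can be sketched in plain userspace C (a toy singly linked
list stands in for hlist, and TABLE_SIZE plays the role the patch gives
to HASH_BITS(name); all names here are illustrative, not from the patch):

```c
#include <stddef.h>

/* Toy stand-in for the kernel's hlist buckets: each bucket is a plain
 * singly linked list.  All names here are illustrative. */
struct node { int val; struct node *next; };

/* Bucket count derived from the array itself, mirroring what
 * HASH_BITS(name) does in the patch, so "bits" never has to be passed
 * by hand. */
#define TABLE_SIZE(name) (sizeof(name) / sizeof((name)[0]))

/* One iterator for the whole table; no *_size variant needed. */
#define table_for_each(name, bkt, pos) \
        for ((bkt) = 0; (bkt) < TABLE_SIZE(name); (bkt)++) \
                for ((pos) = (name)[(bkt)]; (pos); (pos) = (pos)->next)

int sum_table(void)
{
        struct node a = { 1, NULL }, b = { 2, NULL }, c = { 3, NULL };
        struct node *table[4] = { &a, NULL, &b, NULL };
        struct node *pos;
        size_t bkt;
        int sum = 0;

        b.next = &c;            /* bucket 2 holds a two-entry chain */
        table_for_each(table, bkt, pos)
                sum += pos->val;
        return sum;             /* visits 1, 2 and 3 */
}
```

Because the bucket count is recovered from the array type itself, a
separate *_size iterator buys the caller nothing.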


> +        for (bkt = 0; bkt < HASH_SIZE(bits); bkt++)
> +                hlist_for_each_entry(obj, node, &name[bkt], member)
> +
> +/**
> + * hash_for_each - iterate over a hashtable
> + * @name: hashtable to iterate
> + * @bkt: integer to use as bucket loop cursor
> + * @node: the &struct list_head to use as a loop cursor for each bucket
> + * @obj: the type * to use as a loop cursor for each bucket
> + * @member: the name of the hlist_node within the struct
> + */
> +#define hash_for_each(name, bkt, node, obj, member)
> +        hash_for_each_size(name, HASH_BITS(name), bkt, node, obj, member)
> +

[...]

> +/**
> + * hash_for_each_possible - iterate over all possible objects for a given key
> + * @name: hashtable to iterate
> + * @obj: the type * to use as a loop cursor for each bucket
> + * @bits: bit count of hashing function of the hashtable
> + * @node: the &struct list_head to use as a loop cursor for each bucket
> + * @member: the name of the hlist_node within the struct
> + * @key: the key of the objects to iterate over
> + */
> +#define hash_for_each_possible_size(name, obj, bits, node, member, key)
> +        hlist_for_each_entry(obj, node, &name[hash_min(key, bits)], member)

Second point: "for_each_possible" does not express the iteration scope.
Citing WordNet: "possible adj 1: capable of happening or existing;" --
which has nothing to do with iteration on duplicate keys within a hash
table.

I would recommend renaming "possible" to "duplicate", e.g.:

hash_for_each_duplicate()

which clearly states the scope of this iteration: duplicate keys.

Thanks,

Mathieu

--
Mathieu Desnoyers
Operating System Efficiency R&D Consultant
EfficiOS Inc.
http://www.efficios.com

--
dm-devel mailing list
dm-devel@redhat.com
https://www.redhat.com/mailman/listinfo/dm-devel
 
08-19-2012, 02:16 PM
Mathieu Desnoyers

hashtable: introduce a small and naive hashtable

* Mathieu Desnoyers (mathieu.desnoyers@efficios.com) wrote:
> * Sasha Levin (levinsasha928@gmail.com) wrote:
[...]
> > +/**
> > + * hash_for_each_possible - iterate over all possible objects for a given key
> > + * @name: hashtable to iterate
> > + * @obj: the type * to use as a loop cursor for each bucket
> > + * @bits: bit count of hashing function of the hashtable
> > + * @node: the &struct list_head to use as a loop cursor for each bucket
> > + * @member: the name of the hlist_node within the struct
> > + * @key: the key of the objects to iterate over
> > + */
> > +#define hash_for_each_possible_size(name, obj, bits, node, member, key)
> > +        hlist_for_each_entry(obj, node, &name[hash_min(key, bits)], member)
>
> Second point: "for_each_possible" does not express the iteration scope.
> Citing WordNet: "possible adj 1: capable of happening or existing;" --
> which has nothing to do with iteration on duplicate keys within a hash
> table.
>
> I would recommend to rename "possible" to "duplicate", e.g.:
>
> hash_for_each_duplicate()
>
> which clearly says what is the scope of this iteration: duplicate keys.

OK, about this part: I now see that you iterate over all objects within
the same hash chain. I guess the description "iterate over all possible
objects for a given key" is misleading: it's not all objects with a
given key, but rather all objects hashing to the same bucket.

I understand that you don't want to build knowledge of the key
comparison function in the iterator (which makes sense for a simple hash
table).

By the way, the comment "@obj: the type * to use as a loop cursor for
each bucket" is also misleading: it's a loop cursor for each entry,
since you iterate over all nodes within a single bucket. The same goes
for "@node: the &struct list_head to use as a loop cursor for each
bucket".

So with these documentation changes applied, hash_for_each_possible
starts to make more sense, because it refers to entries that can
_possibly_ be a match (or not). Other options would be
hash_chain_for_each() or hash_bucket_for_each().
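The semantics under discussion can be illustrated with a toy userspace
table (plain C, illustrative names; the kernel's hlist API is not used
here): iterating a bucket yields entries that may merely collide, so the
caller has to compare keys itself.

```c
#include <stddef.h>

/* Toy 4-bucket table with hash(key) = key & 3.  A bucket chain can hold
 * entries whose keys merely collide, which is why a bucket iterator can
 * only offer "possible" matches: the caller must compare keys itself. */
struct entry { int key; int val; struct entry *next; };

#define HASH(key) ((key) & 3)

/* Walk the one bucket a key can live in (cf. hash_for_each_possible). */
#define for_each_possible(table, pos, key) \
        for ((pos) = (table)[HASH(key)]; (pos); (pos) = (pos)->next)

static int lookup(struct entry *table[], int key)
{
        struct entry *pos;

        for_each_possible(table, pos, key)
                if (pos->key == key)    /* skip mere collisions */
                        return pos->val;
        return -1;
}

int demo(void)
{
        /* keys 5 and 9 collide: 5 & 3 == 9 & 3 == 1 */
        struct entry e9 = { 9, 90, NULL };
        struct entry e5 = { 5, 50, &e9 };
        struct entry *table[4] = { NULL, &e5, NULL, NULL };

        return lookup(table, 9);        /* walks past colliding key 5 */
}
```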

Thanks,

Mathieu

--
Mathieu Desnoyers
Operating System Efficiency R&D Consultant
EfficiOS Inc.
http://www.efficios.com

 
08-25-2012, 04:24 AM
Mathieu Desnoyers

hashtable: introduce a small and naive hashtable

* Tejun Heo (tj@kernel.org) wrote:
> Hello,
>
> On Sat, Aug 25, 2012 at 12:59:25AM +0200, Sasha Levin wrote:
> > That's the thing: the number of things you can do with a given bucket
> > is very limited. You can't add entries to any point besides the head (without
> > walking the entire list).
>
> Kinda my point. We already have all the hlist*() interface to deal
> with such cases. Having something which is evidently the trivial
> hlist hashtable and advertises as such in the interface can be
> helpful. I think we need that more than we need anything fancy.
>
> Heh, this is a debate about which one is less insignificant. I can
> see your point. I'd really like to hear what others think on this.
>
> Guys, do we want something which is evidently trivial hlist hashtable
> which can use hlist_*() API directly or do we want something better
> encapsulated?

My 2 cents, FWIW: I think this specific effort should target a trivially
understandable API and implementation, for use-cases where one would
otherwise be tempted to reimplement their own trivial hash table anyway.
So exposing hlist internals, with which kernel developers are already
familiar, seems like a good approach in my opinion, because hiding
things behind a new abstraction might make the target users go away.

Then, as we see the need, we can eventually merge a more elaborate hash
table with ponies and whatnot, but I would expect that the trivial hash
table implementation would still be useful. There are of course very
compelling reasons to use a more featureful hash table: automatic
resize, RT-aware updates, scalable updates, etc., but I see a purpose
for a trivial implementation. Its primary strong points are:

- it's trivially understandable, so anyone who wants to be really sure
  they won't end up debugging the hash table instead of their
  work-in-progress code can have a full understanding of it,
- it has few dependencies, which makes it easier to understand and
  easier to use in some contexts (e.g. early boot).

So I'm in favor of not overdoing the abstraction for this trivial hash
table; honestly, I would prefer that this trivial hash table stay
trivial. A more elaborate hash table should probably come as a separate
API.

Thanks,

Mathieu

--
Mathieu Desnoyers
Operating System Efficiency R&D Consultant
EfficiOS Inc.
http://www.efficios.com

 
08-28-2012, 10:11 AM
Mathieu Desnoyers

hashtable: introduce a small and naive hashtable

* Sasha Levin (levinsasha928@gmail.com) wrote:
> On 08/25/2012 06:24 AM, Mathieu Desnoyers wrote:
> > [...]
>
> Alright, let's keep it simple then.
>
> I do want to keep the hash_for_each[rcu,safe] family though.

Just a thought: if the API offered by the simple hash table focuses on
providing a mechanism to find the hash bucket containing the hash chain
for the key being looked up, and then expects the user to use the hlist
API to iterate on that chain (with or without the hlist _rcu variant),
then it would seem consistent for a whole-table helper to iterate only
over the buckets, and let the user call the hlist for-each iterator of
their choice on each node within the bucket, e.g.:

struct hlist_head *head;
struct hlist_node *pos;

hash_for_each_bucket(ht, head) {
        hlist_for_each(pos, head) {
                ...
        }
}

That way you only have to provide one single macro
(hash_for_each_bucket), and rely on the already existing:

- hlist_for_each_entry
- hlist_for_each_safe
- hlist_for_each_entry_rcu
- hlist_for_each_safe_rcu
.....

and various flavors that may appear in the future, without duplicating
this API. So you won't even have to create _rcu, _safe, or _safe_rcu
versions of the hash_for_each_bucket macro.
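A userspace sketch of this single-macro approach (toy singly linked
lists instead of hlist; all names are made up for illustration):

```c
#include <stddef.h>

/* Toy lists stand in for hlist; names are illustrative. */
struct node { int val; struct node *next; };

/* Outer iterator only: hand each bucket head to the caller, who then
 * walks it with whatever list iterator fits (plain, _safe, _rcu, ...). */
#define for_each_bucket(table, n, bkt, head) \
        for ((bkt) = 0; \
             (bkt) < (n) && ((head) = (table)[(bkt)], 1); \
             (bkt)++)

int count_all(void)
{
        struct node c = { 3, NULL };
        struct node b = { 2, &c };
        struct node a = { 1, NULL };
        struct node *table[3] = { &a, NULL, &b };
        struct node *head, *pos;
        size_t bkt;
        int count = 0;

        for_each_bucket(table, 3, bkt, head)
                for (pos = head; pos; pos = pos->next)  /* inner walk */
                        count++;
        return count;   /* three entries across the two non-empty buckets */
}
```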

Thoughts ?

Thanks,

Mathieu

--
Mathieu Desnoyers
Operating System Efficiency R&D Consultant
EfficiOS Inc.
http://www.efficios.com

 
08-28-2012, 11:56 AM
Mathieu Desnoyers

hashtable: introduce a small and naive hashtable

* Sasha Levin (levinsasha928@gmail.com) wrote:
> On 08/28/2012 12:11 PM, Mathieu Desnoyers wrote:
> > * Sasha Levin (levinsasha928@gmail.com) wrote:
> >> [...]
> >> Alright, let's keep it simple then.
> >>
> >> I do want to keep the hash_for_each[rcu,safe] family though.
> >
> > [...]
> > Thoughts ?
>
> In my opinion, the downside here is that it'll require 2 function calls and 2
> levels of nesting for a simple hash iteration.

Those are macros, not functions. No function call is required. But I see
your point about nesting.

>
> hash_for_each_bucket() will always be followed by an iteration of that
> bucket, so splitting a hash_for_each() which does both into 2
> different functions that must almost always be called in that
> given order sounds unintuitive to me.
>
> It's also just 3 different possible iterators:
>
> - hlist_for_each_entry
> - hlist_for_each_entry_safe
> - hlist_for_each_entry_rcu
>
> So I think that it's a good price to pay - 2 extra macro definitions
> in the header to save a macro call + nesting level in each place that
> uses a hashtable.

I must admit I don't care that much one way or another.

Thanks,

Mathieu

>
>
> Thanks,
> Sasha

--
Mathieu Desnoyers
Operating System Efficiency R&D Consultant
EfficiOS Inc.
http://www.efficios.com

 
08-28-2012, 11:00 PM
Mathieu Desnoyers

hashtable: introduce a small and naive hashtable

* Mathieu Desnoyers (mathieu.desnoyers@efficios.com) wrote:
> * Sasha Levin (levinsasha928@gmail.com) wrote:
> [...]
>
> I must admit I don't care that much one way or another.

Looking again at:

+#define hash_for_each_size(name, bits, bkt, node, obj, member) \
+        for (bkt = 0; bkt < HASH_SIZE(bits); bkt++) \
+                hlist_for_each_entry(obj, node, &name[bkt], member)

you will notice that a "break" or "continue" in the inner loop will not
affect the outer loop, which is certainly not what the programmer would
expect!

I advise strongly against creating such an error-prone construct.
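The pitfall is easy to reproduce in a userspace toy (plain C lists
instead of hlist; all names are illustrative):

```c
#include <stddef.h>

struct node { int val; struct node *next; };

/* Double loop hidden behind one macro, shaped like hash_for_each_size().
 * Toy lists stand in for hlist; names are illustrative. */
#define table_for_each(table, n, bkt, pos) \
        for ((bkt) = 0; (bkt) < (n); (bkt)++) \
                for ((pos) = (table)[(bkt)]; (pos); (pos) = (pos)->next)

/* Try to stop at the first entry: "break" only leaves the inner loop,
 * so iteration silently resumes with the next bucket. */
int visited_after_break(void)
{
        struct node a = { 1, NULL }, b = { 2, NULL };
        struct node *table[2] = { &a, &b };
        struct node *pos;
        size_t bkt;
        int visits = 0;

        table_for_each(table, 2, bkt, pos) {
                visits++;
                break;          /* the programmer expects one visit... */
        }
        return visits;          /* ...but every bucket gets visited once */
}
```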

Thanks,

Mathieu




--
Mathieu Desnoyers
Operating System Efficiency R&D Consultant
EfficiOS Inc.
http://www.efficios.com

 
09-04-2012, 04:32 PM
Mathieu Desnoyers

hashtable: introduce a small and naive hashtable

* Steven Rostedt (rostedt@goodmis.org) wrote:
> On Tue, 2012-08-28 at 19:00 -0400, Mathieu Desnoyers wrote:
>
> > Looking again at:
> >
> > +#define hash_for_each_size(name, bits, bkt, node, obj, member) \
> > +        for (bkt = 0; bkt < HASH_SIZE(bits); bkt++) \
> > +                hlist_for_each_entry(obj, node, &name[bkt], member)
> >
> > you will notice that a "break" or "continue" in the inner loop will not
> > affect the outer loop, which is certainly not what the programmer would
> > expect!
> >
> > I advise strongly against creating such error-prone construct.
> >
>
> A few existing loop macros do this. But they require a do { } while ()
> approach, and all have a comment.
>
> It's used by do_each_thread() in sched.h

Yes. It's worth noting that it is a do_each_thread() /
while_each_thread() pair.


> and ftrace does this as well.
> Look at kernel/trace/ftrace.c at do_for_each_ftrace_rec().

Same here.

>
> Yes it breaks 'break' but it does not break 'continue' as it would just
> go to the next item that would have been found (like a normal for
> would).

Good point.

So would changing hash_for_each_size() to a
do_each_hash_size()/while_each_hash_size() pair make it clearer that
this contains a double loop? (Along with an appropriate comment about
break.)
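A userspace sketch of what such a pair could look like (toy lists,
illustrative names; modeled on the do_each_thread()/while_each_thread()
shape, not taken from any patch):

```c
#include <stddef.h>

struct node { int val; struct node *next; };

/* A do_each/while_each pair makes the open double loop visible at the
 * call site, like do_each_thread()/while_each_thread() in sched.h.
 * "break" still only leaves the inner loop, hence the warning comment
 * such macros carry in the kernel. */
#define do_each_entry(table, n, bkt, pos) \
        for ((bkt) = 0; (bkt) < (n); (bkt)++) { \
                for ((pos) = (table)[(bkt)]; (pos); (pos) = (pos)->next) {

#define while_each_entry() \
                } \
        }

int sum_all(void)
{
        struct node b = { 2, NULL };
        struct node a = { 1, &b };
        struct node *table[2] = { &a, NULL };
        struct node *pos;
        size_t bkt;
        int sum = 0;

        do_each_entry(table, 2, bkt, pos)
                sum += pos->val;        /* visits 1 and 2 in bucket 0 */
        while_each_entry();
        return sum;
}
```

The unbalanced braces force the caller to close the construct with the
matching while_each_entry(), which is exactly what makes the double loop
visible at the call site.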

Thanks,

Mathieu

>
> -- Steve
>
>

--
Mathieu Desnoyers
Operating System Efficiency R&D Consultant
EfficiOS Inc.
http://www.efficios.com

 
09-04-2012, 05:01 PM
Mathieu Desnoyers

hashtable: introduce a small and naive hashtable

* Pedro Alves (palves@redhat.com) wrote:
> On 09/04/2012 05:30 PM, Pedro Alves wrote:
> > On 09/04/2012 04:35 PM, Steven Rostedt wrote:
> >> On Tue, 2012-08-28 at 19:00 -0400, Mathieu Desnoyers wrote:
> >>
> >>> Looking again at:
> >>>
> >>> +#define hash_for_each_size(name, bits, bkt, node, obj, member) \
> >>> +        for (bkt = 0; bkt < HASH_SIZE(bits); bkt++) \
> >>> +                hlist_for_each_entry(obj, node, &name[bkt], member)
> >>>
> >>> you will notice that a "break" or "continue" in the inner loop will not
> >>> affect the outer loop, which is certainly not what the programmer would
> >>> expect!
> >>>
> >>> I advise strongly against creating such error-prone construct.
> >>>
> >>
> >> A few existing loop macros do this. But they require a do { } while ()
> >> approach, and all have a comment.
> >>
> >> It's used by do_each_thread() in sched.h and ftrace does this as well.
> >> Look at kernel/trace/ftrace.c at do_for_each_ftrace_rec().
> >>
> >> Yes it breaks 'break' but it does not break 'continue' as it would just
> >> go to the next item that would have been found (like a normal for
> >> would).
> >
> > /*
> > * This is a double for. Do not use 'break' to break out of the loop,
> > * you must use a goto.
> > */
> > #define do_for_each_ftrace_rec(pg, rec) \
> >         for (pg = ftrace_pages_start; pg; pg = pg->next) { \
> >                 int _____i; \
> >                 for (_____i = 0; _____i < pg->index; _____i++) { \
> >                         rec = &pg->records[_____i];
> >
> >
> >
> > You can make 'break' also work as expected if you can embed a little knowledge
> > of the inner loop's condition in the outer loop's condition. Sometimes it's
> > trivial, most often when the inner loop's iterator is a pointer that goes
> > NULL at the end, but other times not so much. Something like (completely untested):
> >
> > #define do_for_each_ftrace_rec(pg, rec) \
> >         for (pg = ftrace_pages_start, rec = &pg->records[pg->index]; \
> >              pg && rec == &pg->records[pg->index]; \
> >              pg = pg->next) { \
> >                 int _____i; \
> >                 for (_____i = 0; _____i < pg->index; _____i++) { \
> >                         rec = &pg->records[_____i];
> >
> >
> > (other variants possible)
> >
> > IOW, the outer loop only iterates if the inner loop completes. If there's
> > a break in the inner loop, then the outer loop breaks too. Of course, it
> > all depends on whether the generated code looks sane or hideous, if
> > the uses of the macro care for it over bug avoidance.
> >
>
> BTW, you can also go a step further and remove the need to close with double }},
> with something like:
>
> #define do_for_each_ftrace_rec(pg, rec) \
>         for (pg = ftrace_pages_start, rec = &pg->records[pg->index]; \
>              pg && rec == &pg->records[pg->index]; \
>              pg = pg->next) \
>                 for (rec = pg->records; rec < &pg->records[pg->index]; rec++)

Maybe in some cases there might be ways to combine the two loops into
one? I'm not seeing exactly how to do it for this one, but it should
not be impossible: if the inner loop condition can be moved to the
outer loop, we could use (blah ? loop1_cond : loop2_cond) to test for
different conditions depending on the context, and do the same for the
3rd argument of the for() loop. The details elude me for now, though,
so maybe it's complete nonsense.

It might not be that useful for do_for_each_ftrace_rec, but if we can do
it for the hash table iterator, it might be worth it.

Thanks,

Mathieu


--
Mathieu Desnoyers
Operating System Efficiency R&D Consultant
EfficiOS Inc.
http://www.efficios.com

 
09-06-2012, 02:33 PM
Mathieu Desnoyers

hashtable: introduce a small and naive hashtable

* Sasha Levin (levinsasha928@gmail.com) wrote:
> On 09/04/2012 07:01 PM, Mathieu Desnoyers wrote:
> >> #define do_for_each_ftrace_rec(pg, rec) \
> >>         for (pg = ftrace_pages_start, rec = &pg->records[pg->index]; \
> >>              pg && rec == &pg->records[pg->index]; \
> >>              pg = pg->next) \
> >>                 for (rec = pg->records; rec < &pg->records[pg->index]; rec++)
> > Maybe in some cases there might be ways to combine the two loops into
> > one? I'm not seeing exactly how to do it for this one, but it should
> > not be impossible: if the inner loop condition can be moved to the
> > outer loop, we could use (blah ? loop1_cond : loop2_cond) to test for
> > different conditions depending on the context, and do the same for the
> > 3rd argument of the for() loop. The details elude me for now, though,
> > so maybe it's complete nonsense.
> >
> > It might not be that useful for do_for_each_ftrace_rec, but if we can do
> > it for the hash table iterator, it might be worth it.
>
> So I think that for the hash iterator it might actually be simpler.
>
> My solution to making 'break' work in the iterator is:
>
> for (bkt = 0, node = NULL; bkt < HASH_SIZE(name) && node == NULL; bkt++)
>         hlist_for_each_entry(obj, node, &name[bkt], member)
>
> We initialize our node loop cursor with NULL in the external loop, and the
> external loop will have a new condition to loop while that cursor is NULL.
>
> My logic is that we can only 'break' when we are iterating over an object in the
> internal loop. If we're iterating over an object in that loop then 'node != NULL'.
>
> This way, if we broke from within the internal loop, the external loop will see
> node as not NULL, and so it will stop looping itself. On the other hand, if the
> internal loop has actually ended, then node will be NULL, and the outer loop
> will keep running.
>
> Is there anything I've missed?

This sounds good. Unless I'm missing something too.
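Sasha's trick, transposed to a userspace toy (plain C lists instead of
hlist, illustrative names), shows "break" now terminating the whole
table walk:

```c
#include <stddef.h>

struct node { int val; struct node *next; };

/* Prime the inner cursor to NULL and keep the outer loop running only
 * while it is still NULL.  A "break" from the body leaves the cursor
 * non-NULL, so the outer loop terminates as well; a bucket that is
 * exhausted normally leaves the cursor NULL and iteration continues. */
#define table_for_each(table, n, bkt, pos) \
        for ((bkt) = 0, (pos) = NULL; \
             (bkt) < (n) && (pos) == NULL; \
             (bkt)++) \
                for ((pos) = (table)[(bkt)]; (pos); (pos) = (pos)->next)

int visited_after_break(void)
{
        struct node a = { 1, NULL }, b = { 2, NULL };
        struct node *table[2] = { &a, &b };
        struct node *pos;
        size_t bkt;
        int visits = 0;

        table_for_each(table, 2, bkt, pos) {
                visits++;
                break;          /* now ends the whole table walk */
        }
        return visits;          /* only the first entry is visited */
}
```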

Thanks!

Mathieu

>
>
> Thanks,
> Sasha

--
Mathieu Desnoyers
Operating System Efficiency R&D Consultant
EfficiOS Inc.
http://www.efficios.com

 
09-06-2012, 04:50 PM
Mathieu Desnoyers

hashtable: introduce a small and naive hashtable

* Sasha Levin (levinsasha928@gmail.com) wrote:
> On 09/06/2012 06:00 PM, Steven Rostedt wrote:
> > > I think that that code doesn't make sense. The users of hlist_for_each_* aren't
> > > supposed to be changing the loop cursor.
> > I totally agree. Modifying the 'node' pointer is just asking for issues.
> > Yes that is error prone, but not due to the double loop. It's due to the
> > modifying of the node pointer that is used internally by the loop
> > counter. Don't do that :-)
>
> While we're on this subject, I haven't actually seen hlist_for_each_entry() code
> that even *touches* 'pos'.
>
> Will people yell at me loudly if I change the prototype of those macros to be:
>
> hlist_for_each_entry(tpos, head, member)
>
> (Dropping the 'pos' parameter), and updating anything that calls those macros to
> drop it as well?

I think the intent there is to keep the hlist macros and the list macros
slightly in sync. Given how widely those are used, I'm not sure you want
to touch them. But hey, that's just my 2 cents.

Thanks,

Mathieu

>
>
> Thanks,
> Sasha

--
Mathieu Desnoyers
Operating System Efficiency R&D Consultant
EfficiOS Inc.
http://www.efficios.com

 
