Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Is there a concurrent_hash_set container ? #1584

Open
fdiedler opened this issue Dec 12, 2024 · 3 comments
Open

Is there a concurrent_hash_set container ? #1584

fdiedler opened this issue Dec 12, 2024 · 3 comments
Labels

Comments

@fdiedler
Copy link

Hi,

I see there is a concurrent_hash_map container but no concurrent_hash_set container in TBB. Is it possible to have concurrent_hash_set container ?

The concurrent_unordered_set seems to have poor performance in my application in the clear() operation.

Thanks,

@pavelkumbrasev
Copy link
Contributor

We do not have concurrent_hash_set in oneTBB but you still can use concurrent_hash_map and just ignore the value (use something like bool to save some memory).
@kboyarinov what do you think is it possible to extend concurrent_hash_map to support the set behavior?

@kboyarinov
Copy link
Contributor

@pavelkumbrasev @fdiedler
I think in theory it is possible to add a hash set container on top of concurrent_hash_map engine. Could you please explain your use case for such case as well as the performance issues with concurrent_unordered_set::clear to proper understand the motivation for such an extension?
Thanks in advance

@fdiedler
Copy link
Author

fdiedler commented Jan 8, 2025

@kboyarinov @pavelkumbrasev

Let's take this simple example (I omitted the includes for readability) :

struct Foo
{
    std::bitset<512> field1;
    int field2;

    struct Equality {
        inline bool operator()(const Foo& a, const Foo& b) const
        {
            return (a.field1 == b.field1 && a.field2 == b.field2);
        };
    };

    struct Hash {
        inline size_t operator()(const Foo& n) const
        {
            return (n.field2 ^ std::hash<std::bitset<512>>()(n.field1));
        };
    };
};

// Example from https://www.intel.com/content/www/us/en/docs/onetbb/developer-guide-api-reference/2021-12/concurrent-hash-map.html
struct MyHashCompare {
    static size_t hash( const Foo& n ) {
        return (n.field2 ^ std::hash<std::bitset<512>>()(n.field1));
    }
    static bool equal( const Foo& a, const Foo& b ) {
        return (a.field1 == b.field1 && a.field2 == b.field2);
    }
};

int main()
{
    using duration_t = std::chrono::duration<int, std::milli>;
    std::chrono::time_point<std::chrono::high_resolution_clock> start, end;
    //tbb::concurrent_unordered_set<Foo, Foo::Hash, Foo::Equality> unorderedSet;
    tbb::concurrent_hash_map<Foo, Foo, MyHashCompare> unorderedSet;

    start = std::chrono::high_resolution_clock::now();
    for (size_t i=0; i<10000000; ++i)
    {
        Foo tmp;
        tmp.field1[i % 512] = 1;
        tmp.field2 = i;
        unorderedSet.insert({tmp, tmp}); // for concurrent_unordered_set : unorderedSet.insert(tmp);
    }
    end = std::chrono::high_resolution_clock::now();

    std::cout << "Size = " << unorderedSet.size() << " add in " << std::chrono::duration_cast<duration_t>(end - start).count() << " ms " << std::endl;

    start = std::chrono::high_resolution_clock::now();
    unorderedSet.clear();
    end = std::chrono::high_resolution_clock::now();

    std::cout << "Time to clear = " << std::chrono::duration_cast<duration_t>(end - start).count() << " ms";

    return 0;
}

The output on my computer (Intel(R) Core(TM) i7-9750H CPU / Windows 10) :

Size = 10000000 add in 4661 ms
Time to clear = 1942 ms

If I use a tbb::concurrent_unordered_set instead of the tbb::concurrent_hash_map

Size = 10000000 add in 6649 ms
Time to clear = 2967 ms

The Hash map clear function is really faster (note that adding elements is also faster with Hash map). But I don't need the {Key -> Value} and that is why it would benice to have a tbb;;concurrent_hash_set with better performance than tbb::concurrent_unordered_set.

Thanks,

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

3 participants