Deadlock when creating large graph #252

@ldhulipala

Hi all,

I was trying to create a large graph (Twitter, ~41M vertices and ~2B undirected edges), but ran into some trouble. The input file is stored in the DIMACS format, and I'm using the DIMACS loader, invoked as follows:

env STINGER_MAX_MEMSIZE=300G ./bin/stinger_server -i twitter_sym.gr -t d

After adding print statements, I noticed that the code stalls in the parallel loop that calls insert_edge(..) for every edge read from the input file. A graph about 1/10th the size loads in about 2 minutes, so I would expect this load to take about 20-30 minutes, but after several hours it still appears to be stuck in this loop. Twitter has a heavily skewed degree distribution, so many threads could be trying to insert into the same vertex's list simultaneously. Could this be causing the insertion code to deadlock?
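
To put a number on the skew hypothesis independently of STINGER, here is a rough sketch that counts how often each vertex appears as an arc endpoint in the input. It assumes the file follows the 9th DIMACS Challenge .gr layout (`c` comment lines, a `p sp n m` problem line, and `a u v w` arc lines); I have not checked that this is exactly what the loader expects. `degree_skew` and `top_k` are names I made up for this example, not part of STINGER.

```python
from collections import Counter


def degree_skew(lines, top_k=5):
    """Count arc endpoints per vertex in DIMACS .gr-style lines.

    Assumes arc lines look like "a <u> <v> <w>"; every other line
    (comments, the problem line, blanks) is skipped.
    Returns (number of arcs, top_k (vertex, count) pairs,
    share of all endpoints that fall on the hottest vertex).
    """
    degree = Counter()
    arcs = 0
    for line in lines:
        parts = line.split()
        if not parts or parts[0] != "a":
            continue
        u, v = int(parts[1]), int(parts[2])
        degree[u] += 1
        degree[v] += 1
        arcs += 1
    top = degree.most_common(top_k)
    # Each arc contributes two endpoints, hence the 2 * arcs denominator.
    hottest_share = top[0][1] / (2 * arcs) if arcs else 0.0
    return arcs, top, hottest_share


if __name__ == "__main__":
    # Tiny made-up input, only to show the call shape.
    sample = [
        "c tiny example",
        "p sp 4 4",
        "a 1 2 1",
        "a 1 3 1",
        "a 1 4 1",
        "a 2 3 1",
    ]
    print(degree_skew(sample))
```

Passing an open file handle (or an `itertools.islice` over one) works the same way as the list above. A plain `Counter` is only practical on a prefix of a file this size; for the full ~2B edges a preallocated array indexed by vertex id would probably be needed. A large share on a handful of vertices would be consistent with heavy contention on those vertices' lists, though it would not by itself distinguish a true deadlock from very slow progress.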

I'm currently running the loop without the parallel-for to see whether it completes when run sequentially. If it does, that would suggest a deadlock issue in insert_edge(..) when loading this graph in parallel. I will report back with the result tomorrow.

Thanks,
Laxman
