We do not need a new project folder; we will keep using RForSNA, which you should have set up in WS1.
This time we will add a new .Rmd file with a name like "network_properties.Rmd".
Navigate to your RForSNA folder and create the new .Rmd file.
Then add a code chunk that loads the necessary libraries, and run it.
library(tidyverse) # tools for cleaning data
library(igraph)    # package for doing network analysis
library(tidygraph) # tools for doing tidy networks
library(here)      # tools for project-based workflow
library(ggraph)    # plotting tools for networks
library(boot)      # to do resampling
This workshop assumes you did the work from week 1 and that you know how to install libraries, read csv files into data frames, and manipulate data in R.
The benefit of having done the work previously is that we don't have to redo it (unless we want to change something).
So I took the network from the end of workshop #1 and saved it as an .rds file. I could have saved it as a pair of csv files and rebuilt the network, but an rds file is ready to go as soon as it is loaded!
To load an rds file the syntax is a little different:
gr <- readRDS(here("data", "WorkshopNetwork.rds"))
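For anyone who has not used rds files before, here is a minimal sketch of the save-and-reload round trip. The `toy` object and the temporary file path are stand-ins I made up for illustration; in the workshop the object would be the network from week 1 and the path would point into your data folder.

```r
# Hypothetical stand-in object; in the workshop this would be the week 1 network.
toy <- list(
  nodes = data.frame(name = c("a", "b", "c")),
  edges = data.frame(from = c(1, 2), to = c(2, 3))
)

# A temporary file is used so the sketch does not touch your project folder.
rds_path <- tempfile(fileext = ".rds")

saveRDS(toy, rds_path)          # write a single R object to disk
restored <- readRDS(rds_path)   # read it back and assign it to a name yourself

all.equal(toy, restored)        # check that the object survived the round trip
```

Unlike `load()`, `readRDS()` returns the object, so you choose the name it gets when you read it in.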
While we are at it we might as well set a layout.
```r
GrLayout <- create_layout(gr, layout = "kk")
ggraph(GrLayout) +
  geom_edge_link() +
  geom_node_point() +
  theme(legend.position="bottom")
```
These are fairly easy to calculate
These require a bit more work
Here is a list of many common graph metrics available in tidygraph: https://rdrr.io/cran/tidygraph/man/graph_measures.html
Here is density
edge_density(gr)
## [1] 0.02853107
Here is diameter. Note that with tidygraph you have to wrap graph_diameter() in the with_graph() function and tell it which graph to use.
with_graph(gr, graph_diameter())
## [1] 6
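To make these two numbers less of a black box, here is a small sketch on a made-up four-node chain, using plain igraph functions instead of the workshop network. For a directed graph without self-loops, density is the number of edges divided by the number of possible edges, n * (n - 1).

```r
library(igraph)

# Hypothetical toy graph: a directed chain 1 -> 2 -> 3 -> 4.
toy <- make_graph(c(1, 2,  2, 3,  3, 4), directed = TRUE)

n <- vcount(toy)   # number of nodes
m <- ecount(toy)   # number of edges

# Density by hand: 3 edges out of 4 * 3 = 12 possible directed edges.
manual_density <- m / (n * (n - 1))

# The igraph helpers should agree with the hand calculation.
toy_density  <- edge_density(toy)
toy_diameter <- diameter(toy)   # longest shortest path, here 1 -> 2 -> 3 -> 4
```

In this toy graph the density is 0.25 and the diameter is 3, because the longest shortest path runs the full length of the chain.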
So?
with_graph(gr, graph_reciprocity()) #Reciprocity
## [1] 0.2142857
transitivity(gr) #Transitivity
## [1] 0.2669492
with_graph(gr, graph_mean_dist()) #Average Distance
## [1] 1.88125
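If it is not obvious what reciprocity and average distance are counting, a hand-checkable toy graph may help. This is only a sketch on an invented three-node graph, using the igraph functions directly.

```r
library(igraph)

# Hypothetical toy graph with edges 1 -> 2, 2 -> 1, and 2 -> 3.
toy <- make_graph(c(1, 2,  2, 1,  2, 3), directed = TRUE)

# Reciprocity: share of edges whose reverse edge also exists.
# 1 -> 2 and 2 -> 1 are returned, 2 -> 3 is not, so this is 2 out of 3.
toy_recip <- reciprocity(toy)

# Average distance: mean shortest-path length over pairs that can reach each other.
# Reachable pairs: 1->2 (1 step), 1->3 (2), 2->1 (1), 2->3 (1), so 5 / 4.
toy_dist <- mean_distance(toy, directed = TRUE)
```

Node 3 cannot reach anyone, so it contributes no paths to the average. The same thing happens in the workshop network when part of the graph is unreachable.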
GrFunction <- function(gr){
  Giant = max(clusters(gr)$csize)
  AMPMHomophily = with_graph(gr, graph_assortativity(AMPM))
  DessertHomophily = with_graph(gr, graph_assortativity(Dessert))
  AveDeg = gr %>%
    activate(nodes) %>%
    mutate(Deg = centrality_degree(mode = 'total')) %>%
    select(Deg) %>%
    as_tibble() %>%
    summarise(AveDeg = mean(Deg))
  df = tibble(Giant, AMPMHomophily, DessertHomophily, AveDeg)
  return(df)
}
GrFunction(gr)
## # A tibble: 1 x 4
##   Giant AMPMHomophily DessertHomophily AveDeg
##   <dbl>         <dbl>            <dbl>  <dbl>
## 1    30        -0.723           0.0468   3.37
Giant = 30: there are 30 people connected in the largest component.
AMPM Homophily: since this is negative, people tend to associate with others who are unlike them (e.g., a morning person is more likely to be connected to a night owl).
Dessert Homophily: this is pretty close to zero, so dessert preference does not seem to be associated with who is connected to whom.
AveDeg: the average person has 3.37 edges, counting both incoming and outgoing.
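To see what the sign of a homophily score means, here is a sketch with two invented four-node graphs that share one made-up trait. It uses igraph's `assortativity_nominal()` on undirected toy graphs, so it illustrates the idea and does not reproduce the calculation on the workshop network.

```r
library(igraph)

# Hypothetical trait: nodes 1 and 3 are type 1, nodes 2 and 4 are type 2.
trait <- c(1, 2, 1, 2)

# Every edge joins two different types.
mixed <- make_graph(c(1, 2,  3, 4,  1, 4,  2, 3), directed = FALSE)

# Every edge joins two nodes of the same type.
matched <- make_graph(c(1, 3,  2, 4), directed = FALSE)

r_mixed   <- assortativity_nominal(mixed, types = trait)     # fully "opposites attract"
r_matched <- assortativity_nominal(matched, types = trait)   # fully "birds of a feather"
```

The mixed graph scores -1 and the matched graph scores +1. The AMPM score of -0.723 sits much closer to the first case, while the Dessert score sits near the middle.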
First, we need to tell the simulation how many nodes and edges to include.
N_Nodes = with_graph(gr, graph_order())
N_Edges = with_graph(gr, graph_size())
set.seed(522)
ER_gr <- play_erdos_renyi(n = N_Nodes, m = N_Edges)
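As a rough sketch of what this kind of simulation does, the example below draws a small random graph with a fixed number of nodes and edges using igraph's `sample_gnm()`. I am assuming this is the same G(n, m) model that `play_erdos_renyi()` uses when you give it `m`; the sizes 10 and 15 are invented for illustration.

```r
library(igraph)

set.seed(522)

# Hypothetical sizes: 10 nodes and exactly 15 directed edges placed at random.
toy_er <- sample_gnm(n = 10, m = 15, directed = TRUE, loops = FALSE)

toy_nodes <- vcount(toy_er)               # node count matches n
toy_edges <- ecount(toy_er)               # edge count matches m
has_repeats <- any_multiple(toy_er)       # TRUE if any edge was drawn twice
has_loops <- any(which_loop(toy_er))      # TRUE if any node points to itself
```

The node and edge counts always match what you asked for; only the placement of the edges changes from one seed to the next.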
ER_gr %>% ggraph(layout = "kk") + geom_edge_link() + geom_node_point() + theme(legend.position="bottom")
ggraph(GrLayout) + geom_edge_link() + geom_node_point()
ER_gr %>% ggraph(layout = "kk") + geom_edge_link() + geom_node_point()
I emailed you all an additional file that has a random graph with attributes already attached (because it was a pain to add those in). So now we need to load that file.
ER_gr_attr <- readRDS(here("data", "RandomGraph.rds"))
## # A tbl_graph: 58 nodes and 101 edges
## #
## # A directed simple graph with 1 component
## #
## # Node Data: 58 x 3 (active)
##   name  AMPM            Dessert
##   <chr> <chr>           <chr>
## 1 5106  morning person. Brownies
## 2 6633  morning person. I don't care for dessert.
## 3 7599  night owl.      Ice Cream
## 4 4425  morning person. I don't care for dessert.
## 5 2495  morning person. Brownies
## 6 6355  night owl.      Ice Cream
## # … with 52 more rows
## #
## # Edge Data: 101 x 2
##    from    to
##   <int> <int>
## 1     1    12
## 2     1    25
## 3     1    33
## # … with 98 more rows
GrFunction(gr)
## # A tibble: 1 x 4
##   Giant AMPMHomophily DessertHomophily AveDeg
##   <dbl>         <dbl>            <dbl>  <dbl>
## 1    30        -0.723           0.0468   3.37
GrFunction(ER_gr_attr)
## # A tibble: 1 x 4
##   Giant AMPMHomophily DessertHomophily AveDeg
##   <dbl>         <dbl>            <dbl>  <dbl>
## 1    58        -0.684         -0.00420   3.48
In order to do this, let's reduce the number of graph metrics we want to calculate.
SmGrFunction <- function(gr){
  Giant = max(clusters(gr)$csize)
  Recip = with_graph(gr, graph_reciprocity())
  Trans = transitivity(gr)
  Dist = with_graph(gr, graph_mean_dist())
  Dia = with_graph(gr, graph_diameter())
  df = tibble(Giant, Recip, Trans, Dist, Dia)
  return(df)
}
for (i in 1:100) {
  test_gr = play_erdos_renyi(n = N_Nodes, m = N_Edges)
  if (i == 1) {
    df <- SmGrFunction(test_gr)
  } else {
    df <- bind_rows(df, SmGrFunction(test_gr))
  }
}
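The loop above works fine. If you would rather not grow a data frame inside a loop, one possible alternative is to collect the rows in a list and bind them once at the end. The sketch below uses a made-up `toy_metrics()` helper and invented graph sizes (20 nodes, 40 edges) instead of SmGrFunction() and the workshop network, and it calls igraph directly.

```r
library(igraph)

# Hypothetical stand-in for SmGrFunction(): one row of metrics per graph.
toy_metrics <- function(g) {
  data.frame(
    Giant = max(components(g)$csize),   # size of the largest component
    Recip = reciprocity(g),             # share of edges that are returned
    Dist  = mean_distance(g)            # average shortest-path length
  )
}

set.seed(522)

# Simulate 100 random graphs and keep one row of metrics for each.
sims <- lapply(1:100, function(i) {
  toy_metrics(sample_gnm(n = 20, m = 40, directed = TRUE))
})

# Stack the 100 one-row data frames into a single table.
sim_df <- do.call(rbind, sims)
```

The result has one row per simulated graph, just like df above, so the same summarise() calls would apply to it.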
head(df)
## # A tibble: 6 x 5
##   Giant  Recip  Trans  Dist   Dia
##   <dbl>  <dbl>  <dbl> <dbl> <dbl>
## 1    56 0.0396 0.0561  5.14    14
## 2    58 0.0198 0.0464  6.43    23
## 3    55 0.0396 0.0636  5.04    12
## 4    60 0.0198 0.135   4.87    10
## 5    59 0.0594 0.0677  5.41    16
## 6    57 0      0.0752  4.69    14
And let's average these out.
df %>% summarise(across(everything(), mean))
## # A tibble: 1 x 5
##   Giant  Recip  Trans  Dist   Dia
##   <dbl>  <dbl>  <dbl> <dbl> <dbl>
## 1  58.0 0.0297 0.0545  5.16  13.7
And standard deviations.
df %>% summarise(across(everything(), sd))
## # A tibble: 1 x 5
##   Giant  Recip  Trans  Dist   Dia
##   <dbl>  <dbl>  <dbl> <dbl> <dbl>
## 1  1.46 0.0230 0.0239 0.602  2.49
df %>% summarise(across(everything(), mean))
## # A tibble: 1 x 5
##   Giant  Recip  Trans  Dist   Dia
##   <dbl>  <dbl>  <dbl> <dbl> <dbl>
## 1  58.0 0.0297 0.0545  5.16  13.7
SmGrFunction(gr)
## # A tibble: 1 x 5
##   Giant Recip Trans  Dist   Dia
##   <dbl> <dbl> <dbl> <dbl> <dbl>
## 1    30 0.214 0.267  1.88     6
Let's see if the measured average distance falls within the middle 95% of the values from the simulated networks.
quantile(df$Dist, probs = c(0.025, 0.975))
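##     2.5%    97.5% 
## 3.889960 6.243622
The measured average distance (1.88) is well below the bottom of this range (3.89), so it is very unlikely that the measured network was produced by a purely random process with the same number of nodes and edges. In other words, the measured network is not a random network!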
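The comparison can also be done in code instead of by eye. Here is a minimal sketch, with the caveat that `sim_dist` is a made-up stand-in for df$Dist: it is drawn from a normal distribution using the simulated mean (5.16) and standard deviation (0.602) reported earlier, which is an assumption for illustration and not the real simulated values.

```r
set.seed(522)

# Hypothetical stand-in for df$Dist; in the workshop you would use df$Dist itself.
sim_dist <- rnorm(100, mean = 5.16, sd = 0.602)

# Measured average distance from SmGrFunction(gr).
obs_dist <- 1.88

# Middle 95% of the simulated values.
bounds <- quantile(sim_dist, probs = c(0.025, 0.975))

# TRUE when the measured value falls outside the simulated range.
outside <- unname(obs_dist < bounds[1] | obs_dist > bounds[2])

# Share of simulated graphs with an average distance at least as small as ours.
share_as_small <- mean(sim_dist <= obs_dist)
```

The last line gives a rough empirical p-value: if none of the simulated graphs are as tightly knit as the measured one, the share is zero.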
We did it!
Thanks to Dali Ma
Thanks to SSRC
Thanks to all of you
👏