PEER Advanced Field School 2023

Day 4 - Network Models

Eric Brewe
Professor of Physics at Drexel University

15 June 2023, last update: 2023-06-23

1 / 18

Welcome Back!

Our project is about building and analyzing the Workshop network

We will continue to use the WorkshopNetwork.rds data file.

library(tidyverse) #tools for cleaning data
library(igraph) #package for doing network analysis
library(tidygraph) #tools for doing tidy networks
library(here) #tools for project-based workflow
library(ggraph) #plotting tools for networks
library(boot) #to do resampling
2 / 18

Let's load our data

gr <- readRDS(here("data", "WorkshopNetwork.rds"))
3 / 18

Let's plot it quickly

And get some important features of the graph.

What are the distinctive features of the network?

```r
GrLayout <- create_layout(gr,
                          layout = "kk")
ggraph(GrLayout) +
  geom_edge_link() +
  geom_node_point() +
  theme(legend.position = "bottom")
```

gr
## # A tbl_graph: 60 nodes and 101 edges
## #
## # A directed multigraph with 8 components
## #
## # A tibble: 60 × 4
## name AMPM Dessert Pages
## <chr> <chr> <chr> <dbl>
## 1 5106 morning person. Brownies 350
## 2 6633 morning person. I don't care for dessert. 12
## 3 7599 night owl. Ice Cream 300
## 4 4425 morning person. I don't care for dessert. 0
## 5 2495 morning person. Brownies 264
## 6 6355 night owl. Ice Cream 4
## # ℹ 54 more rows
## #
## # A tibble: 101 × 2
## from to
## <int> <int>
## 1 1 1
## 2 2 35
## 3 3 14
## # ℹ 98 more rows

Note, there are 60 nodes and 101 edges
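
These features can also be pulled out programmatically rather than read off the printout. A minimal sketch on a small made-up graph (the edge list below is hypothetical, not the workshop data); the igraph functions used here should also accept a tbl_graph such as gr:

```r
library(igraph)

# Hypothetical toy edge list standing in for the workshop data:
# one self-loop (1 -> 1), one repeated edge (2 -> 3), and a separate pair (5 -> 6).
toy_edges <- data.frame(
  from = c(1, 2, 2, 3, 5),
  to   = c(1, 3, 3, 4, 6)
)
toy_gr <- graph_from_data_frame(toy_edges, directed = TRUE)

n_nodes      <- vcount(toy_gr)             # number of nodes
n_edges      <- ecount(toy_gr)             # number of edges
n_components <- count_components(toy_gr)   # weakly connected components
has_loops    <- any(which_loop(toy_gr))    # TRUE if an edge starts and ends on the same node
has_multi    <- any_multiple(toy_gr)       # TRUE if an edge is repeated (a multigraph)
```

Self-loops and repeated edges are what make a printout say "multigraph", and the component count shows how fragmented the network is.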

4 / 18

Orient

5 / 18

Let's decide if our network is unique

What is the big problem with the network that we have collected?

  1. We have one network.
  2. There is no variance in the metrics.
  3. What is the null hypothesis/model?
6 / 18

Let's talk about camps

At this point, network analysis splits into roughly two camps.

Statistical Camp

  • Variance: Permutation/resampling techniques such as bootstrap.
  • Null hypothesis: Exponential random graph models (ERGMs) or rewired networks
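
A rewired network can serve as a null model that keeps every node's degree but shuffles who is connected to whom. A minimal sketch, assuming a random directed graph as a stand-in for the observed network; the seed and the number of rewiring steps are arbitrary choices:

```r
library(igraph)

set.seed(2023)  # arbitrary seed so the sketch is repeatable

# Stand-in for the observed network (assumption: not the workshop data).
observed <- sample_gnm(n = 60, m = 101, directed = TRUE)

# Swap edge endpoints many times while keeping each node's degree fixed.
rewired <- rewire(observed, with = keeping_degseq(loops = FALSE, niter = 1000))

# Every node keeps its in- and out-degree; only the partners change.
same_out <- all(degree(observed, mode = "out") == degree(rewired, mode = "out"))
same_in  <- all(degree(observed, mode = "in")  == degree(rewired, mode = "in"))
```

Repeating the rewiring many times gives a distribution of any metric under the hypothesis that only the degree sequence matters.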

Graph Theoretic Camp

  • Null model: Theory-driven models
    • Random Graph (Erdos-Renyi)
    • Small World (Watts-Strogatz)
    • Preferential Attachment (Barabasi-Albert)
  • Variance: Network simulation
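
Each of the three theory-driven models has a generator in igraph (tidygraph wraps them as play_* functions). A sketch with 60 nodes to match the workshop network; the neighborhood size, rewiring probability, and edges per new node are illustrative assumptions, not values from the workshop:

```r
library(igraph)

set.seed(2023)  # arbitrary seed so the sketch is repeatable

n_nodes <- 60

# Random graph (Erdos-Renyi): a fixed number of edges placed uniformly at random.
er_net <- sample_gnm(n = n_nodes, m = 101, directed = TRUE)

# Small world (Watts-Strogatz): a ring lattice with a fraction of edges rewired.
# nei = 2 and p = 0.05 are illustrative choices.
sw_net <- sample_smallworld(dim = 1, size = n_nodes, nei = 2, p = 0.05)

# Preferential attachment (Barabasi-Albert): new nodes favor well-connected nodes.
# m = 2 edges per new node is an illustrative choice.
pa_net <- sample_pa(n = n_nodes, m = 2, directed = TRUE)

# Compare one whole-graph metric (global clustering) across the three models.
clustering <- sapply(list(er = er_net, sw = sw_net, pa = pa_net), transitivity)
```

Which generator is the right null model depends on which mechanism you want to rule out: pure chance, local clustering with shortcuts, or rich-get-richer growth.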
7 / 18

Let's compare two networks

# Measured workshop network
GrLayout <- create_layout(gr,
                          layout = "kk")
ggraph(GrLayout) +
  geom_edge_link() +
  geom_node_point() +
  theme(legend.position = "bottom")

# Random (Erdos-Renyi) network with the same number of nodes and edges
er_gr <- play_erdos_renyi(n = 60, m = 101, loops = FALSE)
GrLayoutER <- create_layout(er_gr,
                            layout = "kk")
ggraph(GrLayoutER) +
  geom_edge_link() +
  geom_node_point() +
  theme(legend.position = "bottom")

8 / 18

Let's talk about how you might compare

What are some ways you can think of to compare these two graphs?

9 / 18

Let's compare based on metrics

Why don't we try density

edge_density(gr)
## [1] 0.02853107
edge_density(er_gr)
## [1] 0.02853107

Well that isn't any fun!
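
The match is by construction. For a directed graph, igraph's default density is m / (n(n - 1)), the number of edges divided by the number of possible ordered pairs, and the random graph was generated with the same n = 60 and m = 101. A quick check, with igraph's sample_gnm() standing in for play_erdos_renyi():

```r
library(igraph)

n <- 60
m <- 101

# Density of a directed graph: edges / possible ordered pairs (self-loops excluded)
by_hand <- m / (n * (n - 1))

# Any random graph with the same n and m has exactly this density
random_gr <- sample_gnm(n = n, m = m, directed = TRUE)
from_igraph <- edge_density(random_gr)

round(by_hand, 8)
## [1] 0.02853107
```

Density cannot distinguish the measured network from this null model, so the comparison needs a metric that depends on how the edges are arranged.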

10 / 18

Let's compare based on metrics

Here is diameter. Note that with tidygraph you have to use the with_graph() function, specifying both the graph and the function graph_diameter().

with_graph(gr, graph_diameter())
## [1] 6
with_graph(er_gr, graph_diameter())
## [1] 12

Ok, so the numbers are different...does that mean the graphs are different?

11 / 18

Let's try this 1000 times

# Start by setting up a vector to hold our results.
diameter_results <- numeric(1000)
for (i in 1:1000) {
  # Simulate a random graph with the same number of nodes and edges
  tmp <- play_erdos_renyi(n = 60, m = 101)
  diameter_results[i] <- with_graph(tmp, graph_diameter())
}
# Convert to a data frame
diameter_results_df <- tibble(diam = diameter_results)
# and plot our results, marking the measured diameter in red
ggplot(diameter_results_df, aes(diam)) +
  geom_bar() +
  geom_vline(xintercept = with_graph(gr, graph_diameter()),
             color = "red",
             linewidth = 2) # linewidth replaces size for lines in ggplot2 >= 3.4.0

12 / 18

Let's get a confidence interval

If we want to estimate how confident we are that our measured network differs from the simulated ones on a given metric, we can look at the percentiles of the simulated values.

Let's see if the measured diameter falls within the middle 95% of the simulated networks.

mean(diameter_results_df$diam)
## [1] 13.608
sd(diameter_results_df$diam)
## [1] 2.450376
quantile(diameter_results_df$diam, probs = c(0.025, 0.975))
## 2.5% 97.5%
## 10 19

Since the measured diameter of 6 falls outside this range (10 to 19), we can say fairly confidently that the measured network does not look like an Erdos-Renyi random network with the same number of nodes and edges!
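
The percentile check can be made explicit in code. A self-contained sketch that re-simulates the diameters with igraph's sample_gnm() and diameter() (stand-ins for the tidygraph calls) and takes the measured diameter of 6 as given; the seed is arbitrary, so the exact numbers will differ from the run shown here:

```r
library(igraph)

set.seed(2023)  # arbitrary seed so the sketch is repeatable

observed_diam <- 6  # diameter of the measured network, taken from the earlier slide

# Simulate diameters of random directed graphs with 60 nodes and 101 edges.
sim_diam <- replicate(
  1000,
  diameter(sample_gnm(n = 60, m = 101, directed = TRUE), directed = TRUE)
)

# Middle 95% of the simulated diameters
ci <- quantile(sim_diam, probs = c(0.025, 0.975))

# Is the observed value outside that range?
outside <- observed_diam < ci[[1]] || observed_diam > ci[[2]]

# Share of simulated networks whose diameter is at least as small as the observed one
prop_as_small <- mean(sim_diam <= observed_diam)
```

The proportion in prop_as_small works as a one-sided empirical p-value: the smaller it is, the less the measured diameter looks like a draw from the random model.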

13 / 18

Let's summarize

  1. We measured one network
  2. To compare we generated a network (in this case it was random)
  3. We replicated 1000 times
  4. We checked whether metrics from the measured network fall within some confidence range.
  5. We can say the measured network is different from random.
14 / 18

Let's assign attributes

In order to compare homophily, we need our random graph (er_gr) to have the same node attributes as the measured network.

There is a file that has a random graph with the attributes assigned to it.

ER_gr_attr <- readRDS(here("data", "RandomGraph.rds"))
ER_gr_attr
## # A tbl_graph: 58 nodes and 101 edges
## #
## # A directed simple graph with 1 component
## #
## # A tibble: 58 × 3
## name AMPM Dessert
## <chr> <chr> <chr>
## 1 5106 morning person. Brownies
## 2 6633 morning person. I don't care for dessert.
## 3 7599 night owl. Ice Cream
## 4 4425 morning person. I don't care for dessert.
## 5 2495 morning person. Brownies
## 6 6355 night owl. Ice Cream
## # ℹ 52 more rows
## #
## # A tibble: 101 × 2
## from to
## <int> <int>
## 1 1 12
## 2 1 25
## 3 1 33
## # ℹ 98 more rows
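
One common way to quantify homophily on a categorical attribute is nominal assortativity. A minimal sketch, assuming made-up AMPM labels assigned at random to a freshly simulated random graph (this is not the contents of RandomGraph.rds):

```r
library(igraph)

set.seed(2023)  # arbitrary seed so the sketch is repeatable

rand_gr <- sample_gnm(n = 60, m = 101, directed = TRUE)

# Assign a hypothetical categorical attribute to every node at random.
V(rand_gr)$AMPM <- sample(c("morning person.", "night owl."),
                          vcount(rand_gr), replace = TRUE)

# assortativity_nominal() expects integer category codes.
ampm_codes <- as.integer(factor(V(rand_gr)$AMPM))

# Near 0: ties ignore the attribute; positive: ties favor same-category nodes.
homophily <- assortativity_nominal(rand_gr, ampm_codes, directed = TRUE)
```

Because the labels here are assigned at random, the value should sit near zero; the same calculation on the measured network shows whether its ties favor similar people more than chance would.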

Now you get to try!

  • Choose a network simulator (think about why you chose it)
  • Decide on a metric (better yet - create a function to look at many at once).
  • Simulate networks, measure the chosen metrics, and store the metrics in a data frame.
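
One possible shape for this exercise, as a starting point rather than a solution: a small helper that returns several whole-graph metrics as one row, applied to many simulated networks. The helper name graph_metrics, the choice of metrics, and the use of sample_gnm() as the simulator are all assumptions for illustration:

```r
library(igraph)

set.seed(2023)  # arbitrary seed so the sketch is repeatable

# Hypothetical helper: several whole-graph metrics for one network, as one row.
graph_metrics <- function(g) {
  data.frame(
    density      = edge_density(g),
    diameter     = diameter(g, directed = TRUE),
    transitivity = transitivity(g, type = "global"),
    components   = count_components(g)
  )
}

# Simulate 200 random networks and stack one row of metrics per network.
sim_metrics <- do.call(
  rbind,
  lapply(1:200, function(i) {
    graph_metrics(sample_gnm(n = 60, m = 101, directed = TRUE))
  })
)

# Middle 95% of each simulated metric, ready to compare with the measured network.
sim_ranges <- sapply(sim_metrics, quantile, probs = c(0.025, 0.975), na.rm = TRUE)
```

Swapping sample_gnm() for another generator changes the null model without touching the rest of the pipeline.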
15 / 18

Let's reflect

We did it!

  1. We can import data into R
  2. We can manipulate these data
  3. We can create a network
  4. We can plot a network
  5. We can calculate a number of centrality measures
  6. We can use these in plotting networks
  7. We can use these in testing hypotheses
  8. We can calculate a number of whole graph metrics
  9. We can use these in comparing networks
16 / 18

Let's note what we missed

  1. We didn't really touch Base R
  2. We didn't deal with ERGMs.
17 / 18

Let's be thankful

Thanks to PEER

Thanks to all of you

👏

18 / 18
