Replace all NA values for variable with one row equal to 0

This is slightly difficult to phrase, and as far as I could see none of the similar questions answers my problem.

I have a data.frame such as:

df1 <- data.frame(id = rep(c("a", "b"), each = 4),
                  val = c(NA, NA, NA, NA, 1, 2, 2, 3))

df1

  id val
1  a  NA
2  a  NA
3  a  NA
4  a  NA
5  b   1
6  b   2
7  b   2
8  b   3

and I want to get rid of all the NA values (easy enough using e.g. filter()), but make sure that if this removes every row for one id (in this case it removes every instance of "a"), one extra row is inserted for that id with val = 0,

so that:

Obviously it is easy enough to do this in a roundabout way, but I was wondering if there's a tidy/elegant way to do it. I thought tidyr::complete() might help, but I'm not entirely sure how to apply it to a case like this.
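One way the tidyr::complete() idea might be applied is sketched here. It assumes the goal is to drop every NA row and then add back a single val = 0 row for any id that disappeared; all_ids and res are names made up for this illustration, not part of the original post.

```r
library(dplyr)
library(tidyr)

df1 <- data.frame(id = rep(c("a", "b"), each = 4),
                  val = c(NA, NA, NA, NA, 1, 2, 2, 3))

# Remember every id before filtering, so ids that lose all rows can be restored
all_ids <- unique(df1$id)

res <- df1 %>%
  filter(!is.na(val)) %>%                        # drop all NA rows
  complete(id = all_ids, fill = list(val = 0))   # re-add missing ids with val = 0

res
```

Passing the ids explicitly means the sketch should not depend on id being a factor. Note that complete() may reorder the rows by id.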

I don't care about the order of the rows

Cheers!

asked 2 hours ago

Robert Hickman


So you want to add rows with 0 only if all the values for particular id is 0?
– Ronak Shah
2 hours ago

only if they're all NA for a particular id
– Robert Hickman
2 hours ago

@RobertHickman There seems to be some confusion about your desired output. Could you update your question with the expected output based on this df1 <- data.frame(id = rep(c("a", "b","c"), each = 4), val = c(NA, NA, NA, NA, 1, 2, 2, 3,NA,2,NA,3)) ? Thanks to @VivekKalyanarangan for the data.
– markus
1 hour ago

r na complete

7 Answers

Another idea using dplyr,

library(dplyr)

df1 %>%
  group_by(id) %>%
  mutate(val = ifelse(row_number() == 1 & all(is.na(val)), 0, val)) %>%
  na.omit()

which gives,

# A tibble: 5 x 2
# Groups:   id [2]
  id      val
  <fct> <dbl>
1 a         0
2 b         1
3 b         2
4 b         2
5 b         3
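To see how this behaves when an id mixes NA and non-NA values, here is a rough sketch that wraps the same idea in a function and runs it on the three-group data from the comments under the question. It swaps ifelse() for the replace() form suggested in the comments on this answer and uses filter() instead of na.omit(); first_zero and df_mixed are hypothetical names.

```r
library(dplyr)

# Three-group data from the comments: "a" is all NA, "c" is partly NA
df_mixed <- data.frame(id = rep(c("a", "b", "c"), each = 4),
                       val = c(NA, NA, NA, NA, 1, 2, 2, 3, NA, 2, NA, 3))

first_zero <- function(df) {
  df %>%
    group_by(id) %>%
    # index is 1 when the whole group is NA (first value becomes 0), else 0 (no change)
    mutate(val = replace(val, all(is.na(val)) * 1, 0)) %>%
    ungroup() %>%
    filter(!is.na(val))   # drop every remaining NA row
}

res <- first_zero(df_mixed)
res
# a: one row with val 0; b: 1, 2, 2, 3; c: 2, 3 (its NA rows are dropped)
```

Under this reading, a partly-NA group such as "c" simply loses its NA rows and gets no extra 0 row.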

answered 1 hour ago

Sotos

(+1) Seems like the most robust answer here. Would be marginally more concise using replace(val, all(is.na(val)) * 1, 0) instead of the ifelse(...).
– Mikko Marttila
38 mins ago

@MikkoMarttila Good suggestion. I usually try and avoid ifelse in general
– Sotos
33 mins ago

df1[is.na(df1)] <- 0
df1[!(duplicated(df1$id) & df1$val == 0), ]

  id val
1  a   0
5  b   1
6  b   2
7  b   2
8  b   3
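The sketch below probes how these two lines behave outside the original example, using the small data set from the comment on this answer plus two variations. The wrapper zero_then_dedupe and the data frame names are made up for illustration, and the wrapper fills only the val column rather than the whole data frame.

```r
# Hypothetical wrapper around the two lines above; it works on a copy of its input
zero_then_dedupe <- function(df) {
  df$val[is.na(df$val)] <- 0
  df[!(duplicated(df$id) & df$val == 0), ]
}

# NA is the first row of group "a": the resulting 0 is not a duplicated id, so it stays
mixed_first <- data.frame(id = rep(c("a", "b"), each = 2), val = c(NA, 1, 2, 3))
r1 <- zero_then_dedupe(mixed_first)   # 4 rows: a 0, a 1, b 2, b 3

# NA is the second row of group "a": the resulting 0 is dropped
mixed_last <- data.frame(id = rep(c("a", "b"), each = 2), val = c(1, NA, 2, 3))
r2 <- zero_then_dedupe(mixed_last)    # 3 rows: a 1, b 2, b 3

# A genuine 0 that is not the first row of its group is dropped as well
genuine_zero <- data.frame(id = c("a", "a"), val = c(5, 0))
r3 <- zero_then_dedupe(genuine_zero)  # 1 row: a 5
```

So the result for partly-NA groups depends on row order, and real zeros can be lost, which may or may not matter for a given data set.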

answered 2 hours ago

Adamm

Would this work for ids that contain NAs and non-NAs? Try with df1 <- data.frame(id = rep(c("a", "b"), each = 2), val = c(NA, 1, 2, 3))
– markus
2 hours ago

I think this is the best so far (I'll leave it open for another hour or so to see); I would maybe change it to df %>% replace(is.na(.), 0) %>% .[!(duplicated(.$id) & .$val == 0), ]
– Robert Hickman
1 hour ago

We may do

df1 %>% group_by(id) %>% do(if(all(is.na(.$val))) replace(.[1, ], 2, 0) else na.omit(.))
# A tibble: 5 x 2
# Groups:   id [2]
#   id      val
#   <fct> <dbl>
# 1 a         0
# 2 b         1
# 3 b         2
# 4 b         2
# 5 b         3

After grouping by id, if everything in val is NA for a group, we keep only its first row, with the second element (val) replaced by 0; otherwise the group is returned after applying na.omit.

In a more readable format that would be

df1 %>% group_by(id) %>%
  do(if(all(is.na(.$val))) data.frame(id = .$id[1], val = 0) else na.omit(.))

(Here I presume that you indeed want to get rid of all NA values; otherwise there is no need for na.omit.)
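As far as I know, do() is marked as superseded in current dplyr releases. A rough equivalent of the same per-group logic using group_modify() might look like the sketch below; this is a swapped-in function, not what the answer uses, and it assumes a dplyr version that provides group_modify().

```r
library(dplyr)

df1 <- data.frame(id = rep(c("a", "b"), each = 4),
                  val = c(NA, NA, NA, NA, 1, 2, 2, 3))

res <- df1 %>%
  group_by(id) %>%
  # .x holds the group's rows without the grouping column id,
  # so the replacement row only needs a val column
  group_modify(~ if (all(is.na(.x$val))) data.frame(val = 0) else na.omit(.x)) %>%
  ungroup()

res
```

group_modify() adds the id column back to whatever each group returns, so the all-NA group ends up as a single row with val = 0.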

edited 1 hour ago

answered 1 hour ago

Julius Vainora

@markus, right, I had assumed that that's the goal. Thanks!
– Julius Vainora
1 hour ago

It looks like op wants to retain the first row and replace the val column of that row with 0 where all val is NA for a group. Check my ans pls. Agree with @markus, it does seem tricky
– Vivek Kalyanarangan
1 hour ago

@VivekKalyanarangan, that's what I initially thought, but "and I want to get rid of all the NA values" suggests otherwise.
– Julius Vainora
1 hour ago

A base R option is to find the groups whose values are all NA, set their val to 0, and keep only unique rows so that there is one row per group. We then rbind this data frame with the rows from the groups that are not all NA (!all_NA).

all_NA <- with(df1, ave(is.na(val), id, FUN = all))
rbind(unique(transform(df1[all_NA, ], val = 0)), df1[!all_NA, ])

#  id val
#1  a   0
#5  b   1
#6  b   2
#7  b   2
#8  b   3
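As a small illustration of what the all_NA mask does on data with a partly-NA group, the sketch below reruns the two lines on the three-group data from the comments under the question. The names df_mixed, res and res_strict are hypothetical, and the last step is an optional extra that is not part of the answer.

```r
df_mixed <- data.frame(id = rep(c("a", "b", "c"), each = 4),
                       val = c(NA, NA, NA, NA, 1, 2, 2, 3, NA, 2, NA, 3))

# TRUE for every row whose whole group is NA (only group "a" here)
all_NA <- with(df_mixed, ave(is.na(val), id, FUN = all))

# One zero row per all-NA group, followed by all other rows unchanged
res <- rbind(unique(transform(df_mixed[all_NA, ], val = 0)),
             df_mixed[!all_NA, ])

# Optional: also drop the NA rows that remain in partly-NA groups such as "c"
res_strict <- res[!is.na(res$val), ]
```

Without the last step, group "c" keeps its two NA rows; with it, every NA is gone and only group "a" receives a 0 row.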

A dplyr option looks uglier, but one way is to build two data frames: one from the groups whose values are all NA and one from the groups with no NA values. For each all-NA group we create a single row with its id and val = 0, then bind this to the other data frame.

library(dplyr)

bind_rows(df1 %>%
            group_by(id) %>%
            filter(all(!is.na(val))),
          df1 %>%
            group_by(id) %>%
            filter(all(is.na(val))) %>%
            ungroup() %>%
            summarise(id = unique(id),
                      val = 0)) %>%
  arrange(id)

#   id      val
#  <fct> <dbl>
#1  a         0
#2  b         1
#3  b         2
#4  b         2
#5  b         3

edited 1 hour ago

answered 2 hours ago

Ronak Shah

I changed the df to make the example more exhaustive:

df1 <- data.frame(id = rep(c("a", "b","c"), each = 4),
                  val = c(NA, NA, NA, NA, 1, 2, 2, 3,NA,2,NA,3))

library(dplyr)

df1 %>%
  group_by(id) %>%
  mutate(case=sum(is.na(val))==n(), row_num=row_number() ) %>%
  mutate(val=ifelse(is.na(val)&case,0,val)) %>%
  filter( !(case&row_num!=1) ) %>%
  select(id, val)

Output

  id      val
  <fct> <dbl>
1 a         0
2 b         1
3 b         2
4 b         2
5 b         3
6 c        NA
7 c         2
8 c        NA
9 c         3

answered 1 hour ago

Vivek Kalyanarangan

Here is an option too:

df1 %>%
  mutate_if(is.factor,as.character) %>%
  mutate_all(funs(replace(.,is.na(.),0))) %>%
  slice(4:nrow(.))

This gives:

Alternative:

df1 %>%
  mutate_if(is.factor,as.character) %>%
  mutate_all(funs(replace(.,is.na(.),0))) %>%
  unique()
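One thing worth checking with the unique() alternative is what happens to legitimately repeated rows. The base R sketch below mimics the fill-then-unique steps on the question's data without dplyr; filled and res are names chosen for illustration only.

```r
df1 <- data.frame(id = rep(c("a", "b"), each = 4),
                  val = c(NA, NA, NA, NA, 1, 2, 2, 3))

# Replace every NA in val by 0 (same effect as the mutate_all step on this data)
filled <- df1
filled$val[is.na(filled$val)] <- 0

# unique() collapses the four (a, 0) rows into one, but also the two (b, 2) rows
res <- unique(filled)
res
```

On this data the result has four rows rather than five, because the repeated (b, 2) row is removed along with the extra zeros.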

edited 1 hour ago

answered 1 hour ago

NelsonGon

where did 4 come from?
– Sotos
1 hour ago

The solution produces four 0s. We're only interested in having 1?
– NelsonGon
1 hour ago

What if one group has 4 and another 3?
– Sotos
1 hour ago

Sorry I only answered based on the question. Maybe then we could twist things up, not sure though!
– NelsonGon
1 hour ago

Consider this example - df1 <- data.frame(id = rep(c("a", "b","c"), each = 4), val = c(NA, NA, NA, NA, 1, 2, 2, 3,NA,2,NA,3)) I think here OP wants to remove NA values for A group only, not the rest
– Vivek Kalyanarangan
1 hour ago

Here is a base R solution.

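res <- lapply(split(df1, df1$id), function(DF){
  if(anyNA(DF$val)) {
    i <- is.na(DF$val)
    DF$val[i] <- 0
    # positions of the non-duplicated zero-filled rows; indexing by position
    # avoids recycling a shorter logical vector against i
    keep <- which(i)[!duplicated(DF[i, ])]
    DF <- rbind(DF[keep, ], DF[!i, ])
  }
  DF
})
res <- do.call(rbind, res)
row.names(res) <- NULL
res
#  id val
#1  a   0
#2  b   1
#3  b   2
#4  b   2
#5  b   3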

Edit.

A dplyr solution could be the following.
It was tested with the original dataset posted by the OP, with the dataset in Vivek Kalyanarangan's answer and with the dataset in markus' comment, renamed df2 and df3, respectively.

library(dplyr)

na2zero <- function(DF){
  DF %>%
    group_by(id) %>%
    mutate(val = ifelse(is.na(val), 0, val),
           crit = val == 0 & duplicated(val)) %>%
    filter(!crit) %>%
    select(-crit)
}

na2zero(df1)
na2zero(df2)
na2zero(df3)
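Since df2 and df3 are not defined in the snippet, here is a self-contained sketch for running it. It assumes df2 is the three-group data from Vivek Kalyanarangan's answer and df3 is the two-group data from markus' comment on this answer; that mapping is a guess, not something stated in the answer.

```r
library(dplyr)

# Same function as in the answer above
na2zero <- function(DF){
  DF %>%
    group_by(id) %>%
    mutate(val = ifelse(is.na(val), 0, val),
           crit = val == 0 & duplicated(val)) %>%
    filter(!crit) %>%
    select(-crit)
}

df1 <- data.frame(id = rep(c("a", "b"), each = 4),
                  val = c(NA, NA, NA, NA, 1, 2, 2, 3))
# Assumed: data from Vivek Kalyanarangan's answer
df2 <- data.frame(id = rep(c("a", "b", "c"), each = 4),
                  val = c(NA, NA, NA, NA, 1, 2, 2, 3, NA, 2, NA, 3))
# Assumed: data from markus' comment
df3 <- data.frame(id = rep(c("a", "b"), each = 2), val = c(NA, 1, 2, 3))

r1 <- na2zero(df1)  # a: 0; b: 1, 2, 2, 3
r2 <- na2zero(df2)  # a: 0; b: 1, 2, 2, 3; c: 0, 2, 3
r3 <- na2zero(df3)  # a: 0, 1; b: 2, 3
```

Under these assumptions, every group that contains at least one NA ends up with exactly one 0 row, including partly-NA groups. Because crit tests val == 0, repeated genuine zeros within a group would also be reduced to one.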

edited 46 mins ago

answered 2 hours ago

Rui Barradas

Rui, try with df1 <- data.frame(id = rep(c("a", "b"), each = 2), val = c(NA, 1, 2, 3)). Unfortunately your solution doesn't return a data frame with only three rows.
– markus
1 hour ago

@markus No, it doesn't. The NA is replaced by a 0 and the other value of val is not NA so both must be in the output. At least that's how I'm understanding the OP's problem.
– Rui Barradas
1 hour ago

Fair enough. People are reading the question differently.
– markus
17 mins ago

add a comment |

Your Answer

StackExchange.ifUsing("editor", function () {
StackExchange.using("externalEditor", function () {
StackExchange.using("snippets", function () {
StackExchange.snippets.init();
});
});
}, "code-snippets");

StackExchange.ready(function() {
var channelOptions = {
tags: "".split(" "),
id: "1"
};
initTagRenderer("".split(" "), "".split(" "), channelOptions);

StackExchange.using("externalEditor", function() {
// Have to fire editor after snippets, if snippets enabled
if (StackExchange.settings.snippets.snippetsEnabled) {
StackExchange.using("snippets", function() {
createEditor();
});
}
else {
createEditor();
}
});

function createEditor() {
StackExchange.prepareEditor({
heartbeatType: 'answer',
autoActivateHeartbeat: false,
convertImagesToLinks: true,
noModals: true,
showLowRepImageUploadWarning: true,
reputationToPostImages: 10,
bindNavPrevention: true,
postfix: "",
imageUploader: {
brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
allowUrls: true
},
onDemand: true,
discardSelector: ".discard-answer"
,immediatelyShowMarkdownHelp:true
});

}
});

draft saved

draft discarded

Sign up or log in

StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});

Post as a guest

Name

Required, but never shown

StackExchange.ready(
function () {
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f54022536%2freplace-all-na-values-for-variable-with-one-row-equal-to-0%23new-answer', 'question_page');
}
);

Post as a guest

Name

Required, but never shown

7 Answers
7

active

oldest

votes

7 Answers
7

active

oldest

votes

Another idea using dplyr,

library(dplyr)



df1 %>% 

 group_by(id) %>% 

 mutate(val = ifelse(row_number() == 1 & all(is.na(val)), 0, val)) %>% 

 na.omit()

which gives,

# A tibble: 5 x 2

# Groups:   id [2]

  id      val

  <fct> <dbl>

1 a         0

2 b         1

3 b         2

4 b         2

5 b         3

answered 1 hour ago

Sotos

28.1k51640

1

(+1) Seems like the most robust answer here. Would be marginally more concise using replace(val, all(is.na(val)) * 1, 0) instead of the ifelse(...).
– Mikko Marttila
38 mins ago

@MikkoMarttila Good suggestion. I usually try and avoid ifelse in general
– Sotos
33 mins ago

add a comment |

Another idea using dplyr,

library(dplyr)



df1 %>% 

 group_by(id) %>% 

 mutate(val = ifelse(row_number() == 1 & all(is.na(val)), 0, val)) %>% 

 na.omit()

which gives,

# A tibble: 5 x 2

# Groups:   id [2]

  id      val

  <fct> <dbl>

1 a         0

2 b         1

3 b         2

4 b         2

5 b         3

answered 1 hour ago

Sotos

28.1k51640

1

(+1) Seems like the most robust answer here. Would be marginally more concise using replace(val, all(is.na(val)) * 1, 0) instead of the ifelse(...).
– Mikko Marttila
38 mins ago

@MikkoMarttila Good suggestion. I usually try and avoid ifelse in general
– Sotos
33 mins ago

add a comment |

Another idea using dplyr,

library(dplyr)



df1 %>% 

 group_by(id) %>% 

 mutate(val = ifelse(row_number() == 1 & all(is.na(val)), 0, val)) %>% 

 na.omit()

which gives,

# A tibble: 5 x 2

# Groups:   id [2]

  id      val

  <fct> <dbl>

1 a         0

2 b         1

3 b         2

4 b         2

5 b         3

answered 1 hour ago

Sotos

28.1k51640

Another idea using dplyr,

library(dplyr)



df1 %>% 

 group_by(id) %>% 

 mutate(val = ifelse(row_number() == 1 & all(is.na(val)), 0, val)) %>% 

 na.omit()

which gives,

# A tibble: 5 x 2

# Groups:   id [2]

  id      val

  <fct> <dbl>

1 a         0

2 b         1

3 b         2

4 b         2

5 b         3

answered 1 hour ago

Sotos

28.1k51640

answered 1 hour ago

Sotos

28.1k51640

answered 1 hour ago

Sotos

28.1k51640

answered 1 hour ago

Sotos

28.1k51640

1

(+1) Seems like the most robust answer here. Would be marginally more concise using replace(val, all(is.na(val)) * 1, 0) instead of the ifelse(...).
– Mikko Marttila
38 mins ago

@MikkoMarttila Good suggestion. I usually try and avoid ifelse in general
– Sotos
33 mins ago

add a comment |

1

(+1) Seems like the most robust answer here. Would be marginally more concise using replace(val, all(is.na(val)) * 1, 0) instead of the ifelse(...).
– Mikko Marttila
38 mins ago

@MikkoMarttila Good suggestion. I usually try and avoid ifelse in general
– Sotos
33 mins ago

(+1) Seems like the most robust answer here. Would be marginally more concise using replace(val, all(is.na(val)) * 1, 0) instead of the ifelse(...).
– Mikko Marttila
38 mins ago

@MikkoMarttila Good suggestion. I usually try and avoid ifelse in general
– Sotos
33 mins ago

add a comment |

df1[is.na(df1)] <- 0

df1[!(duplicated(df1$id) & df1$val == 0), ]



  id val

1  a   0

5  b   1

6  b   2

7  b   2

8  b   3

answered 2 hours ago

Adamm

832517

5

Would this work for ids that contain NAs and non-NAs? Try with df1 <- data.frame(id = rep(c("a", "b"), each = 2), val = c(NA, 1, 2, 3))
– markus
2 hours ago

I think this is the best so far (I'll leave it open for another hour or so to see) would maybe change to df %>% replace(is.na(.), 0) %>% .[!(duplicated(.$id) & .$val == 0), ]
– Robert Hickman
1 hour ago

add a comment |

df1[is.na(df1)] <- 0

df1[!(duplicated(df1$id) & df1$val == 0), ]



  id val

1  a   0

5  b   1

6  b   2

7  b   2

8  b   3

answered 2 hours ago

Adamm

832517

5

Would this work for ids that contain NAs and non-NAs? Try with df1 <- data.frame(id = rep(c("a", "b"), each = 2), val = c(NA, 1, 2, 3))
– markus
2 hours ago

I think this is the best so far (I'll leave it open for another hour or so to see) would maybe change to df %>% replace(is.na(.), 0) %>% .[!(duplicated(.$id) & .$val == 0), ]
– Robert Hickman
1 hour ago

add a comment |

df1[is.na(df1)] <- 0

df1[!(duplicated(df1$id) & df1$val == 0), ]



  id val

1  a   0

5  b   1

6  b   2

7  b   2

8  b   3

answered 2 hours ago

Adamm

832517

df1[is.na(df1)] <- 0

df1[!(duplicated(df1$id) & df1$val == 0), ]



  id val

1  a   0

5  b   1

6  b   2

7  b   2

8  b   3

answered 2 hours ago

Adamm

832517

answered 2 hours ago

Adamm

832517

answered 2 hours ago

Adamm

832517

answered 2 hours ago

Adamm

832517

5

Would this work for ids that contain NAs and non-NAs? Try with df1 <- data.frame(id = rep(c("a", "b"), each = 2), val = c(NA, 1, 2, 3))
– markus
2 hours ago

I think this is the best so far (I'll leave it open for another hour or so to see) would maybe change to df %>% replace(is.na(.), 0) %>% .[!(duplicated(.$id) & .$val == 0), ]
– Robert Hickman
1 hour ago

add a comment |

5

Would this work for ids that contain NAs and non-NAs? Try with df1 <- data.frame(id = rep(c("a", "b"), each = 2), val = c(NA, 1, 2, 3))
– markus
2 hours ago

I think this is the best so far (I'll leave it open for another hour or so to see) would maybe change to df %>% replace(is.na(.), 0) %>% .[!(duplicated(.$id) & .$val == 0), ]
– Robert Hickman
1 hour ago

Would this work for ids that contain NAs and non-NAs? Try with df1 <- data.frame(id = rep(c("a", "b"), each = 2), val = c(NA, 1, 2, 3))
– markus
2 hours ago

I think this is the best so far (I'll leave it open for another hour or so to see) would maybe change to df %>% replace(is.na(.), 0) %>% .[!(duplicated(.$id) & .$val == 0), ]
– Robert Hickman
1 hour ago

add a comment |

We may do

df1 %>% group_by(id) %>% do(if(all(is.na(.$val))) replace(.[1, ], 2, 0) else na.omit(.))

# A tibble: 5 x 2

# Groups:   id [2]

#   id      val

#   <fct> <dbl>

# 1 a         0

# 2 b         1

# 3 b         2

# 4 b         2

# 5 b         3

After grouping by id, if everything in val is NA, then we leave only the first row with the second element replaced by 0, otherwise the same data is returned after applying na.omit.

In a more readable format that would be

df1 %>% group_by(id) %>% 

  do(if(all(is.na(.$val))) data.frame(id = .$id[1], val = 0) else na.omit(.))

(Here I presume that you indeed want to get rid of all NA values; otherwise there is no need for na.omit.)

edited 1 hour ago

answered 1 hour ago

Julius Vainora

32.6k75979

1

@markus, right, I had assumed that that's the goal. Thanks!
– Julius Vainora
1 hour ago

It looks like op wants to retain the first row and replace the val column of that row with 0 where all val is NA for a group. Check my ans pls. Agree with @markus, it does seem tricky
– Vivek Kalyanarangan
1 hour ago

1

@VivekKalyanarangan, that's what I initially thought, but "and I want to get rid of all the NA values" suggests otherwise.
– Julius Vainora
1 hour ago

add a comment |

We may do

df1 %>% group_by(id) %>% do(if(all(is.na(.$val))) replace(.[1, ], 2, 0) else na.omit(.))

# A tibble: 5 x 2

# Groups:   id [2]

#   id      val

#   <fct> <dbl>

# 1 a         0

# 2 b         1

# 3 b         2

# 4 b         2

# 5 b         3

After grouping by id, if everything in val is NA, then we leave only the first row with the second element replaced by 0, otherwise the same data is returned after applying na.omit.

In a more readable format that would be

df1 %>% group_by(id) %>% 

  do(if(all(is.na(.$val))) data.frame(id = .$id[1], val = 0) else na.omit(.))

(Here I presume that you indeed want to get rid of all NA values; otherwise there is no need for na.omit.)

edited 1 hour ago

answered 1 hour ago

Julius Vainora

32.6k75979

1

@markus, right, I had assumed that that's the goal. Thanks!
– Julius Vainora
1 hour ago

It looks like op wants to retain the first row and replace the val column of that row with 0 where all val is NA for a group. Check my ans pls. Agree with @markus, it does seem tricky
– Vivek Kalyanarangan
1 hour ago

1

@VivekKalyanarangan, that's what I initially thought, but "and I want to get rid of all the NA values" suggests otherwise.
– Julius Vainora
1 hour ago

add a comment |

We may do

df1 %>% group_by(id) %>% do(if(all(is.na(.$val))) replace(.[1, ], 2, 0) else na.omit(.))

# A tibble: 5 x 2

# Groups:   id [2]

#   id      val

#   <fct> <dbl>

# 1 a         0

# 2 b         1

# 3 b         2

# 4 b         2

# 5 b         3

After grouping by id, if everything in val is NA, then we leave only the first row with the second element replaced by 0, otherwise the same data is returned after applying na.omit.

In a more readable format that would be

df1 %>% group_by(id) %>% 

  do(if(all(is.na(.$val))) data.frame(id = .$id[1], val = 0) else na.omit(.))

(Here I presume that you indeed want to get rid of all NA values; otherwise there is no need for na.omit.)

edited 1 hour ago

answered 1 hour ago

Julius Vainora

32.6k75979

We may do

df1 %>% group_by(id) %>% do(if(all(is.na(.$val))) replace(.[1, ], 2, 0) else na.omit(.))

# A tibble: 5 x 2

# Groups:   id [2]

#   id      val

#   <fct> <dbl>

# 1 a         0

# 2 b         1

# 3 b         2

# 4 b         2

# 5 b         3

After grouping by id, if everything in val is NA, then we leave only the first row with the second element replaced by 0, otherwise the same data is returned after applying na.omit.

In a more readable format that would be

df1 %>% group_by(id) %>% 

  do(if(all(is.na(.$val))) data.frame(id = .$id[1], val = 0) else na.omit(.))

(Here I presume that you indeed want to get rid of all NA values; otherwise there is no need for na.omit.)

edited 1 hour ago

answered 1 hour ago

Julius Vainora

32.6k75979

edited 1 hour ago

answered 1 hour ago

Julius Vainora

32.6k75979

answered 1 hour ago

Julius Vainora

32.6k75979

answered 1 hour ago

Julius Vainora

32.6k75979

1

@markus, right, I had assumed that that's the goal. Thanks!
– Julius Vainora
1 hour ago

It looks like op wants to retain the first row and replace the val column of that row with 0 where all val is NA for a group. Check my ans pls. Agree with @markus, it does seem tricky
– Vivek Kalyanarangan
1 hour ago

1

@VivekKalyanarangan, that's what I initially thought, but "and I want to get rid of all the NA values" suggests otherwise.
– Julius Vainora
1 hour ago

add a comment |

1

@markus, right, I had assumed that that's the goal. Thanks!
– Julius Vainora
1 hour ago

It looks like op wants to retain the first row and replace the val column of that row with 0 where all val is NA for a group. Check my ans pls. Agree with @markus, it does seem tricky
– Vivek Kalyanarangan
1 hour ago

1

@VivekKalyanarangan, that's what I initially thought, but "and I want to get rid of all the NA values" suggests otherwise.
– Julius Vainora
1 hour ago

@markus, right, I had assumed that that's the goal. Thanks!
– Julius Vainora
1 hour ago

It looks like op wants to retain the first row and replace the val column of that row with 0 where all val is NA for a group. Check my ans pls. Agree with @markus, it does seem tricky
– Vivek Kalyanarangan
1 hour ago

@VivekKalyanarangan, that's what I initially thought, but "and I want to get rid of all the NA values" suggests otherwise.
– Julius Vainora
1 hour ago

add a comment |

all_NA <- with(df1, ave(is.na(val), id, FUN = all))

rbind(unique(transform(df1[all_NA, ], val = 0)), df1[!all_NA, ])



#  id val

#1  a   0

#5  b   1

#6  b   2

#7  b   2

#8  b   3

library(dplyr)



bind_rows(df1 %>%

            group_by(id) %>%

            filter(all(!is.na(val))), 

          df1 %>%

             group_by(id) %>%

             filter(all(is.na(val))) %>%

             ungroup() %>%

             summarise(id = unique(id), 

                       val = 0)) %>%

arrange(id)





#   id      val

#  <fct> <dbl>

#1  a         0

#2  b         1

#3  b         2

#4  b         2

#5  b         3

edited 1 hour ago

answered 2 hours ago

Ronak Shah

32.6k103753

add a comment |

all_NA <- with(df1, ave(is.na(val), id, FUN = all))

rbind(unique(transform(df1[all_NA, ], val = 0)), df1[!all_NA, ])



#  id val

#1  a   0

#5  b   1

#6  b   2

#7  b   2

#8  b   3

library(dplyr)



bind_rows(df1 %>%

            group_by(id) %>%

            filter(all(!is.na(val))), 

          df1 %>%

             group_by(id) %>%

             filter(all(is.na(val))) %>%

             ungroup() %>%

             summarise(id = unique(id), 

                       val = 0)) %>%

arrange(id)





#   id      val

#  <fct> <dbl>

#1  a         0

#2  b         1

#3  b         2

#4  b         2

#5  b         3

edited 1 hour ago

answered 2 hours ago

Ronak Shah

32.6k103753

add a comment |

all_NA <- with(df1, ave(is.na(val), id, FUN = all))

rbind(unique(transform(df1[all_NA, ], val = 0)), df1[!all_NA, ])



#  id val

#1  a   0

#5  b   1

#6  b   2

#7  b   2

#8  b   3

library(dplyr)



bind_rows(df1 %>%

            group_by(id) %>%

            filter(all(!is.na(val))), 

          df1 %>%

             group_by(id) %>%

             filter(all(is.na(val))) %>%

             ungroup() %>%

             summarise(id = unique(id), 

                       val = 0)) %>%

arrange(id)





#   id      val

#  <fct> <dbl>

#1  a         0

#2  b         1

#3  b         2

#4  b         2

#5  b         3

edited 1 hour ago

answered 2 hours ago

Ronak Shah

32.6k103753

all_NA <- with(df1, ave(is.na(val), id, FUN = all))

rbind(unique(transform(df1[all_NA, ], val = 0)), df1[!all_NA, ])



#  id val

#1  a   0

#5  b   1

#6  b   2

#7  b   2

#8  b   3

library(dplyr)



bind_rows(df1 %>%

            group_by(id) %>%

            filter(all(!is.na(val))), 

          df1 %>%

             group_by(id) %>%

             filter(all(is.na(val))) %>%

             ungroup() %>%

             summarise(id = unique(id), 

                       val = 0)) %>%

arrange(id)





#   id      val

#  <fct> <dbl>

#1  a         0

#2  b         1

#3  b         2

#4  b         2

#5  b         3

edited 1 hour ago

answered 2 hours ago

Ronak Shah

32.6k103753

edited 1 hour ago

answered 2 hours ago

Ronak Shah

32.6k103753

answered 2 hours ago

Ronak Shah

32.6k103753

answered 2 hours ago

Ronak Shah

32.6k103753

add a comment |

Changed the df to make example more exhaustive -

df1 <- data.frame(id = rep(c("a", "b","c"), each = 4),

                  val = c(NA, NA, NA, NA, 1, 2, 2, 3,NA,2,NA,3))

library(dplyr)

df1 %>%

  group_by(id) %>%

  mutate(case=sum(is.na(val))==n(), row_num=row_number() ) %>%

  mutate(val=ifelse(is.na(val)&case,0,val)) %>%

  filter( !(case&row_num!=1) ) %>%

  select(id, val)

Output

  id      val

  <fct> <dbl>

1 a         0

2 b         1

3 b         2

4 b         2

5 b         3

6 c        NA

7 c         2

8 c        NA

9 c         3

answered 1 hour ago

Vivek Kalyanarangan

4,8811827

add a comment |

Changed the df to make example more exhaustive -

df1 <- data.frame(id = rep(c("a", "b","c"), each = 4),

                  val = c(NA, NA, NA, NA, 1, 2, 2, 3,NA,2,NA,3))

library(dplyr)

df1 %>%

  group_by(id) %>%

  mutate(case=sum(is.na(val))==n(), row_num=row_number() ) %>%

  mutate(val=ifelse(is.na(val)&case,0,val)) %>%

  filter( !(case&row_num!=1) ) %>%

  select(id, val)

Output

  id      val

  <fct> <dbl>

1 a         0

2 b         1

3 b         2

4 b         2

5 b         3

6 c        NA

7 c         2

8 c        NA

9 c         3

answered 1 hour ago

Vivek Kalyanarangan

4,8811827

add a comment |

Changed the df to make example more exhaustive -

df1 <- data.frame(id = rep(c("a", "b","c"), each = 4),

                  val = c(NA, NA, NA, NA, 1, 2, 2, 3,NA,2,NA,3))

library(dplyr)

df1 %>%

  group_by(id) %>%

  mutate(case=sum(is.na(val))==n(), row_num=row_number() ) %>%

  mutate(val=ifelse(is.na(val)&case,0,val)) %>%

  filter( !(case&row_num!=1) ) %>%

  select(id, val)

Output

  id      val

  <fct> <dbl>

1 a         0

2 b         1

3 b         2

4 b         2

5 b         3

6 c        NA

7 c         2

8 c        NA

9 c         3

answered 1 hour ago

Vivek Kalyanarangan

4,8811827

Changed the df to make example more exhaustive -

df1 <- data.frame(id = rep(c("a", "b","c"), each = 4),

                  val = c(NA, NA, NA, NA, 1, 2, 2, 3,NA,2,NA,3))

library(dplyr)

df1 %>%

  group_by(id) %>%

  mutate(case=sum(is.na(val))==n(), row_num=row_number() ) %>%

  mutate(val=ifelse(is.na(val)&case,0,val)) %>%

  filter( !(case&row_num!=1) ) %>%

  select(id, val)

Output

  id      val

  <fct> <dbl>

1 a         0

2 b         1

3 b         2

4 b         2

5 b         3

6 c        NA

7 c         2

8 c        NA

9 c         3

answered 1 hour ago

Vivek Kalyanarangan

4,8811827

answered 1 hour ago

Vivek Kalyanarangan

4,8811827

answered 1 hour ago

Vivek Kalyanarangan

4,8811827

answered 1 hour ago

Vivek Kalyanarangan

4,8811827

add a comment |

Here is an option too:

df1 %>% 

  mutate_if(is.factor,as.character) %>% 

 mutate_all(funs(replace(.,is.na(.),0))) %>% 

  slice(4:nrow(.))

This gives:

Alternative:

df1 %>% 

  mutate_if(is.factor,as.character) %>% 

 mutate_all(funs(replace(.,is.na(.),0))) %>% 

  unique()

edited 1 hour ago

answered 1 hour ago

NelsonGon

815217

3

where did 4 come from?
– Sotos
1 hour ago

The solution produces four 0s. We're only interested in having 1?
– NelsonGon
1 hour ago

What if one group has 4 and another 3?
– Sotos
1 hour ago

Sorry I only answered based on the question. Maybe then we could twist things up, not sure though!
– NelsonGon
1 hour ago

Consider this example - df1 <- data.frame(id = rep(c("a", "b","c"), each = 4), val = c(NA, NA, NA, NA, 1, 2, 2, 3,NA,2,NA,3)) I think here OP wants to remove NA values for A group only, not the rest
– Vivek Kalyanarangan
1 hour ago

Here is a base R solution.

res <- lapply(split(df1, df1$id), function(DF){
  if(anyNA(DF$val)) {
    i <- is.na(DF$val)   # rows of this id whose val is NA
    DF$val[i] <- 0
    # keep only the first of the former-NA rows, plus every non-NA row
    DF <- rbind(DF[i & !duplicated(i), ], DF[!i, ])
  }
  DF
})
res <- do.call(rbind, res)
row.names(res) <- NULL
res
#  id val
#1  a   0
#2  b   1
#3  b   2
#4  b   2
#5  b   3

Edit.

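library(dplyr)

na2zero <- function(DF){
  DF %>%
    group_by(id) %>%
    # flag every NA after the first one within each id, then turn NA into 0;
    # flagging on is.na(val) leaves genuine duplicated zeros alone
    mutate(crit = is.na(val) & duplicated(is.na(val)),
           val = ifelse(is.na(val), 0, val)) %>%
    filter(!crit) %>%
    select(-crit)
}

na2zero(df1)
na2zero(df2)
na2zero(df3)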

edited 46 mins ago

answered 2 hours ago

Rui Barradas

16.1k41730

Rui, try with df1 <- data.frame(id = rep(c("a", "b"), each = 2), val = c(NA, 1, 2, 3)). Unfortunately your solution doesn't return a data frame with only three rows.
– markus
1 hour ago

@markus No, it doesn't. The NA is replaced by a 0 and the other value of val is not NA so both must be in the output. At least that's how I'm understanding the OP's problem.
– Rui Barradas
1 hour ago

Fair enough. People are reading the question differently.
– markus
17 mins ago
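
For comparison, the other reading from the comments above (drop every NA row, and add a single 0 row only for an id that would otherwise vanish) could be sketched in base R as follows. This is a sketch, not a drop-in answer: the helper name drop_na_keep_ids and its fill argument are made up for illustration, and it assumes the data frame has only the two columns id and val.

```r
# Hypothetical helper: remove NA rows, then re-add ids that lost all their rows
drop_na_keep_ids <- function(DF, fill = 0) {
  kept <- DF[!is.na(DF$val), , drop = FALSE]
  # ids present in the input but absent once the NA rows are gone
  lost <- setdiff(unique(DF$id), unique(kept$id))
  if (length(lost) > 0) {
    kept <- rbind(kept,
                  data.frame(id = lost, val = fill, stringsAsFactors = FALSE))
  }
  row.names(kept) <- NULL
  kept
}

# Example from the comments: "a" has one NA and one real value
df_mixed <- data.frame(id = rep(c("a", "b"), each = 2),
                       val = c(NA, 1, 2, 3),
                       stringsAsFactors = FALSE)

# Example from the question: "a" is entirely NA
df_all_na <- data.frame(id = rep(c("a", "b"), each = 4),
                        val = c(NA, NA, NA, NA, 1, 2, 2, 3),
                        stringsAsFactors = FALSE)

res_mixed  <- drop_na_keep_ids(df_mixed)
res_all_na <- drop_na_keep_ids(df_all_na)
```

Under this reading df_mixed comes back with three rows and no 0, while df_all_na gets one extra row for "a". The re-added rows are appended at the end, which should be fine since the row order does not matter here.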
