I get the error "Aggregation function missing: defaulting to length" when I try to reshape the data below:
Id Task Type Freq
3 1 A 2
3 1 B 3
3 2 A 3
3 2 B 0
4 1 A 3
4 1 B 3
4 2 A 1
4 2 B 3
I want to convert it to a wide format like this:
Id A B … Z
3 5 3
4 4 6
df_wide <- dcast(df, Id + Task ~ Type, value.var="Freq")
The system reports the following:
Aggregation function missing: defaulting to length
I tried to solve it using another example I found on a community forum, but it still returned an invalid result. If someone knows the solution, please help. Thanks!
The cause:
You get this message because the dcast function from the reshape2 package is being used to convert a data frame from long to wide format, and more than one value would have to go into a single cell of the wide data frame. fun.aggregate is necessary when several values from the value.var column that share the same value, or combination of values, of the variables on the LHS of the dcast formula (for example, “Id”) are crammed into one cell by the combination of variables on the RHS of the formula (for example, “Type”).
Solution:
The length() default in dcast is instructive because it signals that several rows may map to the same cell and identifies the length > 1 cases that may need special attention. Using list() as fun.aggregate would be more instructive still, because it displays which value.var values end up together in each cell.
Table cells typically have a length of 1. Consequently, the defaulting issue in dcast can be resolved either by changing the formula or by supplying a length-one summarization (aggregation): an operator, or a custom or ready-made function, that returns a length-one result in each case and is appropriate for the task.
The warning is controlled by fun.aggregate (see ?dcast). An aggregation function is required when more than one value is available for a single spot in the wide data frame.
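One way to find the cells that would receive more than one value, before casting at all, is to count the rows per cell. The sketch below uses base R only and rebuilds the sample from the question by hand as df; the names cell_counts, n_values and crowded are made up for illustration.

```r
# Sample data from the question, in long format.
df <- data.frame(
  Id   = c(3, 3, 3, 3, 4, 4, 4, 4),
  Task = c(1, 1, 2, 2, 1, 1, 2, 2),
  Type = c("A", "B", "A", "B", "A", "B", "A", "B"),
  Freq = c(2, 3, 3, 0, 3, 3, 1, 3)
)

# Count how many Freq values would land in each cell of an Id ~ Type table.
cell_counts <- aggregate(Freq ~ Id + Type, data = df, FUN = length)
names(cell_counts)[names(cell_counts) == "Freq"] <- "n_values"

# Cells holding more than one value are the ones that force an aggregation.
crowded <- cell_counts[cell_counts$n_values > 1, ]
print(crowded)

# Adding Task to the grouping leaves exactly one value per cell.
per_task <- aggregate(Freq ~ Id + Task + Type, data = df, FUN = length)
print(all(per_task$Freq == 1))
```

If crowded comes back empty for a given grouping, a cast on that grouping needs no aggregation function.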
Based on your data, dcast(df, Id + Task ~ Type, value.var="Freq") runs without the warning, because each combination of Id, Task and Type has only one value in Freq. The warning message is displayed when you use dcast(df, Id ~ Type, value.var="Freq") instead.
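The difference between the two formulas can be reproduced with a short sketch. It assumes the reshape2 package is installed and rebuilds the question's sample by hand as df; wide_by_task and wide_counts are illustrative names.

```r
library(reshape2)  # assumption: reshape2 is installed

# Sample data from the question, in long format.
df <- data.frame(
  Id   = c(3, 3, 3, 3, 4, 4, 4, 4),
  Task = c(1, 1, 2, 2, 1, 1, 2, 2),
  Type = c("A", "B", "A", "B", "A", "B", "A", "B"),
  Freq = c(2, 3, 3, 0, 3, 3, 1, 3)
)

# One Freq value per Id/Task/Type combination: nothing to aggregate.
wide_by_task <- dcast(df, Id + Task ~ Type, value.var = "Freq")
print(wide_by_task)

# Without Task there are two values per Id/Type cell, so dcast reports
# "Aggregation function missing: defaulting to length" and returns counts.
wide_counts <- dcast(df, Id ~ Type, value.var = "Freq")
print(wide_counts)
```

In the second result every cell is 2: the number of Freq values that landed there, not the values themselves.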
Let’s now look at the top portion of your data:
Id Task Type Freq
3 1 A 2
3 1 B 3
3 2 A 3
3 2 B 0
This shows why. There are two values in Freq for each combination of Id and Type (for Id 3: 2 and 3 for Type A, and 3 and 0 for Type B), while only one value can be put in that spot of the wide data frame for each value of Type. dcast therefore has to combine these values into one. length is the default aggregation function, but you can specify other functions such as sum, mean or sd through fun.aggregate.
With fun.aggregate = sum, for example, Id 3 gets A = 5 and B = 3, and Id 4 gets A = 4 and B = 6. There is no warning, because dcast is being told exactly what to do when more than one value is present: return the total of all values.
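A minimal sketch of passing an explicit aggregation function, again assuming reshape2 is installed and df is the question's sample rebuilt by hand; wide_sum and wide_mean are illustrative names.

```r
library(reshape2)  # assumption: reshape2 is installed

# Sample data from the question, in long format.
df <- data.frame(
  Id   = c(3, 3, 3, 3, 4, 4, 4, 4),
  Task = c(1, 1, 2, 2, 1, 1, 2, 2),
  Type = c("A", "B", "A", "B", "A", "B", "A", "B"),
  Freq = c(2, 3, 3, 0, 3, 3, 1, 3)
)

# Total Freq per Id and Type; the aggregation is explicit, so dcast
# does not fall back to length().
wide_sum <- dcast(df, Id ~ Type, value.var = "Freq", fun.aggregate = sum)
print(wide_sum)

# Any function that returns a single value per cell works the same way.
wide_mean <- dcast(df, Id ~ Type, value.var = "Freq", fun.aggregate = mean)
print(wide_mean)
```

Which function to pass depends on what a cell is supposed to mean; sum is what reproduces the wide table requested in the question.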