python - Retrieve one month of data based on minutes granularity in MongoDB
def getpacketlength(self, groupid, lasttime, tapid):
    pkt_collection = db['tap_pkts']
    reqtime = [1417177485, 1417177500, ........, 1414585185]
    total_sum = 0
    resultset = [
        {"$match": {"groupid": int(groupid),
                    "$and": [{"first_time": {"$lte": reqtime}},
                             {"time": {"$gte": reqtime}}],
                    "tapid": int(tapid)}},
        {'$group': {'_id': '', 'packetlength': {'$sum': '$pkt_length'}}}
    ]
    sum_resultset = pkt_collection.aggregate(pipeline=resultset)
    return
Objective:
I need to retrieve one month of data at 5-minute intervals.
I have a time array of, say, 2000 timestamps:
timearray=[1417177485,1417177500,........,1414585185]
For each time in timearray, I have to compare it against the documents in the collection: the time must be greater than or equal to 'first_time' and less than or equal to 'time'. If that condition is satisfied, sum the 'pkt_length' field of the matching documents and store the sum in an array. Then repeat for the next time in timearray.
output: sumofpktlengths=[4,9,..........6]
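To pin down what is being asked, here is a minimal in-memory sketch of the computation described above. It assumes the documents are already available as Python dicts; the function name `sum_pkt_lengths` and the sample values are hypothetical and only illustrate the rule "first_time <= t <= time".

```python
def sum_pkt_lengths(documents, timearray):
    """For each t in timearray, sum pkt_length over documents active at t."""
    sums = []
    for t in timearray:
        total = 0
        for doc in documents:
            # A document counts for t when first_time <= t <= time.
            if doc["first_time"] <= t <= doc["time"]:
                total += doc["pkt_length"]
        sums.append(total)
    return sums


# Hypothetical sample data, not taken from the real collection.
sample_docs = [
    {"first_time": 100.0, "time": 400.0, "pkt_length": 4},
    {"first_time": 300.0, "time": 900.0, "pkt_length": 5},
]
sample_times = [0, 300, 600, 900, 1200]
sumofpktlengths = sum_pkt_lengths(sample_docs, sample_times)
print(sumofpktlengths)  # [0, 9, 5, 5, 0]
```

This is only a reference for the expected result; it scans every document once per timestamp, so it is not meant as the fast version.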
Problem I am facing (performance):
Right now the Mongo query is written inside a function, and the server calls that function once for each time in timearray. The database executes the query for that single time and returns the result to the server. So if timearray has 2000 entries, the server hits the database 2000 times, and every server-to-database round trip reduces performance. Instead of one round trip per time, I need to iterate over timearray at the query level and send the whole array of results back to the server in one go.
Output: sumofpktlengths=[4,9,..........6], and the function should return sumofpktlengths.
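One possible way to avoid the 2000 round trips, sketched under assumptions: fetch every document that overlaps the whole window with a single query, then compute the per-timestamp sums on the server side of the application with a difference array. The helper names `build_range_filter` and `sums_from_documents` are hypothetical; the pymongo call in the trailing comment assumes a collection object like `pkt_collection` from the question.

```python
from bisect import bisect_left, bisect_right
from itertools import accumulate


def build_range_filter(groupid, tapid, reqtime):
    """One filter for the whole window instead of one query per timestamp."""
    return {
        "groupid": int(groupid),
        "tapid": int(tapid),
        # Keep only documents whose [first_time, time] span overlaps the window.
        "first_time": {"$lte": max(reqtime)},
        "time": {"$gte": min(reqtime)},
    }


def sums_from_documents(documents, reqtime):
    """Return pkt_length sums aligned with reqtime, in reqtime's own order."""
    times = sorted(reqtime)
    diff = [0] * (len(times) + 1)
    for doc in documents:
        lo = bisect_left(times, doc["first_time"])  # first index with times[i] >= first_time
        hi = bisect_right(times, doc["time"])       # one past the last index with times[i] <= time
        if lo < hi:
            diff[lo] += doc["pkt_length"]
            diff[hi] -= doc["pkt_length"]
    running = list(accumulate(diff[:-1]))           # prefix sums give the total at each time
    by_time = dict(zip(times, running))
    return [by_time[t] for t in reqtime]


# Usage sketch (assumes a pymongo collection named pkt_collection):
#   docs = pkt_collection.find(build_range_filter(groupid, tapid, reqtime),
#                              {"first_time": 1, "time": 1, "pkt_length": 1})
#   sumofpktlengths = sums_from_documents(docs, reqtime)
```

The trade-off is that all overlapping documents travel to the application once, but each document is then handled with two binary searches rather than being compared against every timestamp.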
Documents (this is a communication-related project):
The MongoDB collection contains data like the following; I have included one document here.
db.collections.find()
{
    "_id" : { "hash" : "d5229340b53f0493bb391c06f479da6d0ddb7327b7413414844c5309" },
    "count" : 28719,
    "latency" : [ 0 ],
    "first_time" : 1416810981.083066,
    "cal_debug" : true,
    "tuple" : { "sip" : "192.162.2.1", "dip" : "192.162.2.2", "groupid" : 3, "proto" : 1 },
    "tapid" : 5,
    "drop" : false,
    "pkt_length" : 102,
    "groupname" : "tapgroup-1",
    "swap" : false,
    "time" : 1417444470.313817,
    "path" : [ 5 ],
    "filter_bits" : 56,
    "groupid" : 3,
    "hash" : "d5229340b53f0493bb391c06f479da6d0ddb7327b7413414844c5309",
    "tapname" : "t-1"
}
The above is one document in the collection. Each document contains "first_time" and "time" fields, and "time" is greater than "first_time".
"first_time" is the start time of the packet transfer and "time" is its finish time.
To draw a graph of one month of data, I take the current date minus 1 month as the starting point.
From that starting point I build a list of times at 5-minute intervals, shown below as the reqtime array.
reqtime=[17464544,............,171755378] contains the timestamps at 5-minute intervals for one month.
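As an illustration of how such a reqtime list could be built, here is a small sketch. It assumes "1 month" means 30 days and that timestamps are whole Unix seconds; the name `build_reqtime` and both constants are hypothetical.

```python
import time

FIVE_MINUTES = 5 * 60
THIRTY_DAYS = 30 * 24 * 60 * 60  # assumption: "1 month" is treated as 30 days


def build_reqtime(end_time=None, span=THIRTY_DAYS, step=FIVE_MINUTES):
    """Return Unix timestamps from (end_time - span) to end_time, one per step."""
    if end_time is None:
        end_time = int(time.time())  # default to the current time
    start_time = end_time - span
    # range() excludes its stop value, so add 1 to keep end_time itself.
    return list(range(start_time, end_time + 1, step))


reqtime = build_reqtime(end_time=1417444500)
```

With these assumptions the list has one entry per 5-minute step across 30 days, including both endpoints.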
For each value in reqtime, check whether that time lies between "first_time" and "time"; if it does, sum "pkt_length" over the matching documents. Then repeat the same process for the next value in reqtime, and so on, until the last value in reqtime is reached.
This is the process I am trying to implement. If, instead of storing the times in the reqtime array and comparing them, you have another suggestion for a query that covers each 5-minute interval over one month, please suggest it.
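If the iteration really has to happen at the query level, one option is a single aggregation pipeline that attaches to each document the reqtime values it covers and then groups by those values. The sketch below is an assumption, not a tested answer for this data set: it relies on the `$filter` operator, which needs MongoDB 3.2 or later, and the names `build_pipeline` and `align_results` are hypothetical. Note that `$unwind` emits one row per (document, timestamp) pair, so long-lived documents can make the intermediate result large.

```python
def build_pipeline(groupid, tapid, reqtime):
    """Build one aggregation pipeline that sums pkt_length per reqtime value."""
    reqtime = list(reqtime)
    return [
        # Drop documents that cannot overlap the window at all.
        {"$match": {"groupid": int(groupid),
                    "tapid": int(tapid),
                    "first_time": {"$lte": max(reqtime)},
                    "time": {"$gte": min(reqtime)}}},
        # Keep, per document, the timestamps t with first_time <= t <= time.
        {"$project": {"pkt_length": 1,
                      "hits": {"$filter": {
                          "input": {"$literal": reqtime},
                          "as": "t",
                          "cond": {"$and": [{"$lte": ["$first_time", "$$t"]},
                                            {"$gte": ["$time", "$$t"]}]}}}}},
        {"$unwind": "$hits"},
        {"$group": {"_id": "$hits", "packetlength": {"$sum": "$pkt_length"}}},
    ]


def align_results(rows, reqtime):
    """Turn aggregate output rows into a list aligned with reqtime (0 if no match)."""
    totals = {row["_id"]: row["packetlength"] for row in rows}
    return [totals.get(t, 0) for t in reqtime]


# Usage sketch (assumes a pymongo collection named pkt_collection):
#   rows = pkt_collection.aggregate(build_pipeline(groupid, tapid, reqtime))
#   sumofpktlengths = align_results(rows, reqtime)
```

Timestamps that no document covers do not appear in the aggregate output at all, which is why `align_results` fills them in with 0.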