Excel Help.

GEC: Discuss gaming, computers and electronics and venture into the bizarre world of STGODs.

Moderator: Thanas

Post Reply
User avatar
Maxentius
Padawan Learner
Posts: 298
Joined: 2008-05-16 04:12pm
Location: New York City
Contact:

Excel Help.

Post by Maxentius »

I have two spreadsheets, both of which are listings of bars and clubs and the like. One of them is the listing for the entire Los Angeles metropolitan area (e.g. Burbank, Compton, Long Beach, etc), while the other is just the listing for the city of Los Angeles as a municipal entity (i.e. Downtown Los Angeles).

Basically, what I need to do, is find out which of the entries from the City of LA spreadsheet are already in the Greater LA one, as I know there are some missing (the Greater LA .xls is organized by city; it has 3,080 entries for LA, and the City of LA spreadsheet only has 2,868 entries). Those entries then need to be deleted, en-masse is possible.

Is there any way to do this? Was the previous, convoluted paragraph even intelligible? =P Obviously, there's always the Mark I Eyeball method of manually going back and forth from spreadsheet to spreadsheet, Ctrl+Fing my way through and deleting as I go, but if there's a macro/script way to go about this, I would be eternally indebted to whichever of ye sagely gurus is the keeper of such eldritch information.
Rome is an eternal thought in the mind of God... If there were no Rome, I'd dream of her.
--Marcus Licinius Crassus, Spartacus.

User avatar
General Zod
Never Shuts Up
Posts: 29211
Joined: 2003-11-18 03:08pm
Location: The Clearance Rack
Contact:

Post by General Zod »

I don't suppose you could post screenshots? There might be a vlookup & if statement formula combination that can be used but it's hard to give an exact formula without seeing the sheet itself. As long as everything follows some sort of logical structure it should be relatively simple.
"It's you Americans. There's something about nipples you hate. If this were Germany, we'd be romping around naked on the stage here."
User avatar
phongn
Rebel Leader
Posts: 18487
Joined: 2002-07-03 11:11pm

Post by phongn »

If you know that the rows for duplicates will be identical in each spreadsheet, you could merge the two together and then do something like this to clear it out.

Also, if you're dealing with this much data, I strongly suggest you learn how to use a database. Excel is not very good for this kind of work.
User avatar
General Zod
Never Shuts Up
Posts: 29211
Joined: 2003-11-18 03:08pm
Location: The Clearance Rack
Contact:

Post by General Zod »

phongn wrote:If you know that the rows for duplicates will be identical in each spreadsheet, you could merge the two together and then do something like this to clear it out.
If the data's fairly uniform, it's a matter of inserting a single column and performing a vlookup off one of the cells, then copying the formula down. Excel is only really a bitch when it comes to using data that's jumbled and not very orderly.
Last edited by General Zod on 2008-06-04 11:13pm, edited 3 times in total.
"It's you Americans. There's something about nipples you hate. If this were Germany, we'd be romping around naked on the stage here."
User avatar
phongn
Rebel Leader
Posts: 18487
Joined: 2002-07-03 11:11pm

Post by phongn »

General Zod wrote:If the data's fairly uniform, it's a matter of inserting a single column and performing a vlookup off one of the cells, then copying the formula down. Excel is only really a bitch when it comes to using data that's jumbled and not very orderly.
Well, there are sorting methods that help there, but using Access probably would've made his life a lot easier.
User avatar
Maxentius
Padawan Learner
Posts: 298
Joined: 2008-05-16 04:12pm
Location: New York City
Contact:

Post by Maxentius »

phongn wrote:If you know that the rows for duplicates will be identical in each spreadsheet, you could merge the two together and then do something like this to clear it out.

Also, if you're dealing with this much data, I strongly suggest you learn how to use a database. Excel is not very good for this kind of work.
That was actually the first question I asked my boss when I got this job, if they used any kind of database. Unfortunately, the only DB is the one on the website, and I'm dealing with the raw data, because (apparently) it's uploaded directly from spreadsheet format.

I'll post a few screenshots when I'm at work tomorrow. I'm pretty much a nub with Excel as far as it goes past the basic functions. Fortunately, the data is incredibly uniform, the same tabling format is shared between both sheets. Essentially, the goal is to remove the entries already in the City of Los Angeles (which we've already dealt with, as far as my job goes) spreadsheet from the Greater Los Angeles one, so we can see which ones are left over - those are the ones that we need to work on.
Rome is an eternal thought in the mind of God... If there were no Rome, I'd dream of her.
--Marcus Licinius Crassus, Spartacus.

User avatar
General Zod
Never Shuts Up
Posts: 29211
Joined: 2003-11-18 03:08pm
Location: The Clearance Rack
Contact:

Post by General Zod »

phongn wrote:If you know that the rows for duplicates will be identical in each spreadsheet, you could merge the two together and then do something like this to clear it out.
Actually, there's a much asier method for identifying duplicates. You'll need to insert two columns in front of the unique values you're trying to identify in each tab. In the first column you'll label it LA/City of LA respectively, then merge the sheets into a new tab so the data is aligned. Next you just use this code,

Code: Select all

=if(a2=a1,"DUPLICATE","OK")
copy it all the way down and it'll identify the duplicate values. Then you just take the duplicate column, copy/paste special value so it's only text and not the formula, and sort by that column. Get rid of the rows marked as duplicates at the bottom and bam. You're left with only the data you want to use and have a fair deal more control than doing something that automatically deletes.
"It's you Americans. There's something about nipples you hate. If this were Germany, we'd be romping around naked on the stage here."
User avatar
Maxentius
Padawan Learner
Posts: 298
Joined: 2008-05-16 04:12pm
Location: New York City
Contact:

Post by Maxentius »

Here's a few screenshots. I'll try what Zod suggested a bit later, once I've gotten some other work out of the way.

Image

Image
Rome is an eternal thought in the mind of God... If there were no Rome, I'd dream of her.
--Marcus Licinius Crassus, Spartacus.

User avatar
General Zod
Never Shuts Up
Posts: 29211
Joined: 2003-11-18 03:08pm
Location: The Clearance Rack
Contact:

Post by General Zod »

Okay. . .you'll definitely want to merge both sheets and sort by address. Although I'd recommend inserting a new column in both tabs (putting the value LA, the other City of LA in each column respectively), to distinguish the two before merging the lists so you can sort them out later and figure out what goes where after identifying duplicates. Then use the formula up above in a new column next to the address column (which seems like the best method for identifying dupes), so you can identify which values are duplicates and sort everything by city/city of la afterwords.
"It's you Americans. There's something about nipples you hate. If this were Germany, we'd be romping around naked on the stage here."
User avatar
The Jester
Padawan Learner
Posts: 475
Joined: 2005-05-30 08:34am
Location: Japan

Post by The Jester »

If you understand any programming, you can also use a macro.
User avatar
Solauren
Emperor's Hand
Posts: 10388
Joined: 2003-05-11 09:41pm

Post by Solauren »

Step #1 - Copy them both into a new spreadsheet, so you don't screw with your source data.
Step #2 - Insert Column "Source Spreadsheet"
Step #3 - With each spreadsheet, fill that column in with the original name. This will let you resort them back to source later. A quick copy and paste over the required cells will handle this quickly

Step #4 - Sort by Address as the primary, Name as the Secondary
Step #5 - Use this code. Address Column Letters as needed)

=IF(A2 = A1, "Possible Duplicate", "")
Put this in the furthest right column (after the data), for each record

Step #6 - Paste that column as Values. All duplicates are now marked.
Step #7 - Read over the records that are marked as possible duplicates. It's possible you'll have different Business Names in similiar locations.
Remove any 'possible duplicates' markers that don't pan out. Replace ones that do with 'Duplicate'

Step #7 - If you need them seperated with the duplicates removed. Sort the 'Duplicate' column you created in Step #5 + pasted in #6, and delete the duplicates.

Step #8 - If need be; Sort by Spreadsheet, and paste them back.
Post Reply